Bin Packing Problem: A parallel implementation
By Christos Giavroudis
Dissertation submitted in partial fulfilment for the degree of Master of Science in Communication & Information Systems
Department of Informatics & Communications
TEI of Central Macedonia
Suppose you need to place small objects into large containers of fixed size so that as little space as possible is wasted, as few containers as possible are used, and the whole process finishes as quickly as possible.
This is a combinatorial optimization problem, and it is NP-hard.
Bin Packing Problem: Definition
There is a list of numbers called “weights”
These numbers represent objects that need to be packed into “bins” with a particular capacity
The goal is to pack the weights into the smallest number of bins possible
The Bin Packing Problem has many applications, such as:
Objects coming down a conveyor belt need to be packed for shipping.
A construction plan calls for small boards of various lengths, and you need to know how many long boards to order.
Tour groups of various sizes need to be assigned to buses so that the groups are not split up.
Computer files with specified sizes need to be placed into memory blocks of fixed size.
A composer’s music needs to be recorded, where the lengths of the pieces are the weights and the bin capacity is the amount of time that can be stored on an audio CD (80 minutes), and so on.
Bin Packing Problem: Applications
There are many variations of this problem, such as 1-D, 2-D and 3-D packing, linear programming, packing by weight, packing by cost and so on. Because of this diversity, the Bin Packing Problem touches many areas of our lives.
Bin Packing Problem: Variations
Many heuristic algorithms have been developed in the literature to solve the Bin Packing Problem. Briefly, there are two major categories: the classified (sorted) methods and the unclassified (unsorted) methods.
The unclassified methods are Next Fit (NF), First Fit (FF) and Best Fit (BF); the classified methods are Next Fit Decreasing (NFD), First Fit Decreasing (FFD) and Best Fit Decreasing (BFD). This thesis presents Best Fit Decreasing (BFD).
Bin Packing Problem: Classified Algorithm.
Pack the weights 5, 7, 3, 5, 6, 2, 4, 4, 7, and 4 into bins with capacity 10
Bin Packing Problem: Example
There are many possible solutions
We saw a solution with 5 bins.
Is that the best possible solution?
If we add up all the weights, we get 5+7+3+5+6+2+4+4+7+4 = 47.
The lower bound is 47/10 = 4.7, where 10 is the bin capacity and 47 is the sum of all weights. Since the number of bins must be a whole number, we round up.
So, the best we can hope for is 5 bins.
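The lower-bound arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the function name `lower_bound` is chosen here for illustration and does not come from the thesis.

```python
def lower_bound(weights, capacity):
    """Smallest number of bins that could possibly hold all the weights."""
    total = sum(weights)
    # Ceiling division with integers: round total / capacity up.
    return -(-total // capacity)


weights = [5, 7, 3, 5, 6, 2, 4, 4, 7, 4]
print(sum(weights))              # total weight of the example
print(lower_bound(weights, 10))  # 47 / 10 = 4.7, rounded up
```

The bound only says that fewer bins are impossible; it does not guarantee that a packing with that many bins exists.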
One heuristic method for packing weights is to look for the “best fit”.
Consider all of the bins and find the bin that can hold the weight and would have the least room left over after packing it.
This is a one-at-a-time algorithm: we have to decide what to do with each weight, in order, before moving on to the next one.
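The best-fit choice for a single weight can be sketched as follows. The helper name `best_fit_index` and the representation of bins as a list of remaining room are assumptions made for this illustration.

```python
def best_fit_index(remaining, weight):
    """Return the index of the bin that fits `weight` most tightly.

    `remaining` holds the free room of each open bin.
    Returns None when no open bin has enough room.
    """
    best = None
    for i, room in enumerate(remaining):
        # A candidate must have enough room; among candidates keep the one
        # with the least room (ties go to the earliest bin).
        if room >= weight and (best is None or room < remaining[best]):
            best = i
    return best


print(best_fit_index([3, 3, 4, 5], 4))  # the bin with exactly 4 units free
print(best_fit_index([3, 3, 0, 0], 4))  # no bin fits, so a new bin is needed
```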
Best Fit Decreasing Algorithm: Serial Approach
Let’s start over and use the best fit decreasing algorithm.
Sort the list of weights from the biggest to the smallest: 7, 7, 6, 5, 5, 4, 4, 4, 3, 2.
This time we need to keep track of how much room is left in each bin. When we consider a weight, we look at all the bins that have room for it…
…and put it into the bin that will have the least room left over.
In this case, we only have one bin, so the 7 goes in there.
For the next weight (the second 7), we don’t have a bin that has room for it, so we make a new bin.
None of our bins has room for our next weight (6), so we make a new bin.
None of our bins has room for the next weight (5), so we make a new bin.
Bin #4 has the least room left over, so that’s where we put our next weight (5).
Bin #3 has the least room left, so that’s where we put our next weight (4).
The next weight (4) doesn’t fit into any of our bins, so we need to make a fifth bin.
The next weight (4) fits into Bin #5.
The next weight (3) can go into either Bin #1 or Bin #2.
The last weight (2) fits exactly into Bin #5.
The lower bound is (7+7+3+4+6+5+5+4+4+2)/10 = 47/10 = 4.7, which rounds up to 5 bins.
We used exactly 5 bins, so no better packing exists. This is called an optimal solution.
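The whole serial walk-through can be reproduced with a short Python sketch. It is an illustration of the textbook best fit decreasing rule, not the MATLAB code used in the thesis, and it assumes that no single weight exceeds the bin capacity.

```python
def best_fit_decreasing(weights, capacity):
    """Pack weights with the best fit decreasing rule; return the bins."""
    bins = []       # bins[i] is the list of weights packed into bin i
    remaining = []  # remaining[i] is the free room left in bin i
    for w in sorted(weights, reverse=True):  # biggest weight first
        best = None
        for i, room in enumerate(remaining):
            # Tightest bin that still has room for w (ties: earliest bin).
            if room >= w and (best is None or room < remaining[best]):
                best = i
        if best is None:
            # No open bin fits (assumes w <= capacity): open a new bin.
            bins.append([w])
            remaining.append(capacity - w)
        else:
            bins[best].append(w)
            remaining[best] -= w
    return bins


bins = best_fit_decreasing([5, 7, 3, 5, 6, 2, 4, 4, 7, 4], 10)
for number, content in enumerate(bins, start=1):
    print(f"Bin #{number}: {content}")
```

When two bins tie, this sketch picks the earlier one, so the weight 3 lands in Bin #1; choosing Bin #2 instead would be equally valid.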
If the list of weights is very long, or if the bin capacity is very large, the serial approach can be impractical.
The same model also describes scheduling: the weights are tasks that need to be completed.
The bins are “processors,” which are the agents (people, machines, teams, etc.) that will actually perform the tasks.
Bin Packing Problem: Parallel Approach
Let’s start over with the same example and run the best fit decreasing algorithm in parallel on two cores.
Each core packs its own bins.
The data set is divided into two parts with a cyclic data partition.
Each part of the data is mapped to one core.
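A cyclic partition can be sketched in Python as a round-robin split. This is an assumption about what "cyclic" means here: item 1 goes to part 1, item 2 to part 2, item 3 to part 1 again, and so on. The exact assignment used in the worked example of the slides may differ.

```python
def cyclic_partition(items, parts):
    """Deal the items out round-robin into `parts` lists."""
    return [items[p::parts] for p in range(parts)]


sorted_weights = sorted([5, 7, 3, 5, 6, 2, 4, 4, 7, 4], reverse=True)
for core, part in enumerate(cyclic_partition(sorted_weights, 2), start=1):
    print(f"core {core}: {part}")
```

Dealing a sorted list out in this way gives every core a similar mix of large and small weights, which keeps the load per core balanced.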
Best Fit Decreasing Algorithm: Parallel Approach
We have the same example
Sort the list of weights from the biggest to the smallest.
This time we need to separate the data set into two parts.
When we consider a weight, we look at all bins that have room for it, and put it into the bin that will have the least room left.
Now we look at the next weight of each data part separately. None of our bins has room for the next weight.
So we make a new bin for each data partition.
Now each core puts its weight into its new bin.
We examine the next weights separately and simultaneously.
We put them into Bin #3 and Bin #4 respectively, because these bins will have the least room left.
After that, we again examine the next weights separately and simultaneously.
So we make new bins and put the weights into them.
Now it is the turn of the last weights. Bin #1 has the least room left over for the first part, so that’s where we put its weight. Likewise, Bin #2 has the least room left over for the second part, so that’s where we put its weight.
This gives the final result.
This is a different result from the serial best fit decreasing algorithm. Each part is packed optimally for its own weights: the lower bound for the first part is (7+5+5+4+3)/10 = 24/10 = 2.4, which rounds up to 3 bins, and the lower bound for the second part is (7+6+4+4+2)/10 = 23/10 = 2.3, which also rounds up to 3 bins.
We notice that the parallel implementation of the algorithm uses one more bin than the serial one (6 instead of 5), but it is executed in fewer steps.
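The two-part example can be replayed with a small Python sketch that packs each part independently. The two parts are taken from the lower-bound sums above; the function names are illustrative. Threads are used only to keep the sketch short: in CPython they do not speed up CPU-bound work, so a real implementation would use separate processes or, as in the thesis, MATLAB workers.

```python
from concurrent.futures import ThreadPoolExecutor


def best_fit_decreasing(weights, capacity):
    """Pack weights with the best fit decreasing rule; return the bins."""
    bins, remaining = [], []
    for w in sorted(weights, reverse=True):
        best = None
        for i, room in enumerate(remaining):
            if room >= w and (best is None or room < remaining[best]):
                best = i
        if best is None:  # assumes w <= capacity
            bins.append([w])
            remaining.append(capacity - w)
        else:
            bins[best].append(w)
            remaining[best] -= w
    return bins


def pack_in_parallel(parts, capacity):
    """Pack every part on its own worker; bins are never shared."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        return list(pool.map(lambda part: best_fit_decreasing(part, capacity), parts))


parts = [[7, 5, 5, 4, 3], [7, 6, 4, 4, 2]]
packed = pack_in_parallel(parts, 10)
total_bins = sum(len(bins) for bins in packed)
print(packed)
print(total_bins)
```

Because the workers never look into each other's bins, leftover room in one part cannot be used by the other part, which is where the extra bin comes from.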
Will the serial execution of the algorithm or its parallelization be more beneficial to us?
From the previous examples, we can expect that:
For small datasets, it is preferable to use the serial algorithm.
For huge datasets, it is preferable to use the parallel algorithm.
Parallel vs Serial Approach
However, we should run the algorithm serially and then in parallel on different datasets.
In parallel execution, we run each dataset on a varying number of processors in order to compare the results.
Only then can we say whether parallel processing is beneficial or not.
Parallel computing is the use of multiple processors to execute different parts of the same program concurrently.
A parallel computer is a collection of processing elements that can solve big problems quickly by means of well-coordinated collaboration.
Parallel Computing: Definitions
The algorithms were implemented and executed in the laboratory “Parallel and Distributed Processing” at the premises of the Department of Computer Engineering of the Technological Educational Institute of Central Macedonia.
The algorithm was run on computers with a total of 32 processors, each running at 2.4 GHz. We used the software package MATLAB (version 6.2).
MATLAB- Parallel Computing Toolbox™
To run applications on MATLAB Distributed Computing Server, the following steps must be done:
Installation of the MDCE
Starting the MDCE
Setting up the network with the admincenter
MATLAB Distributed Computing Server
First, we set the current directory to \MATLAB\R2010a\toolbox\distcomp\bin
Afterwards, we run the following command in the MATLAB command window:
!mdce install
Finally, we enable the MDCE in the MATLAB command window with the following command:
!mdce start
Installation and activation of MATLAB Distributed Computing Server
We run the command admincenter outside of the MATLAB environment.
This command is located in the directory \MATLAB\R2010a\toolbox\distcomp\bin
Set up of the network
We select the menu Add or Find to define the Hosts of the network.
We create a JobManager called “BinPackingMan” from the Start menu in the Jobmanager part. In this JobManager, we set up eight hostnames with 4 cores per hostname (PC).
We close the admincenter window and select the JobManager from the Parallel menu.
Parallel Computing Toolbox™ lets us solve computationally and data-intensive problems using multicore processors, GPUs, and computer clusters.
In total, we had 32 cores, so we could execute the Best Fit Decreasing algorithm in parallel. We drew useful conclusions about the execution time, the total wastage, the total number of bins used, the speed-up and the efficiency on datasets of different sizes.
function bfd(numberOfPackages,binSize,totalBins,package,totalWastage)
tic
i=1;
while i<=numberOfPackages
    currentPackage=0; % load of the bin that is currently being filled
    while(currentPackage<binSize && i<=numberOfPackages)
        min=300;
        position=0; % position of minimum wastage
        % search the closed bins for the tightest one that still fits package(i)
        for j=totalBins:-1:1
            if wastage(j)-package(i)>=0 && wastage(j)-package(i)<min
                min=wastage(j)-package(i);
                position=j;
            end
        end
        % prefer the current bin if it would leave less room
        a=binSize-(currentPackage+package(i));
        if a>=0 && a<min
            position=0;
        end
        if position~=0 && wastage(position)~=0
            wastage(position)=wastage(position)-package(i);
            i=i+1;
        else
            if currentPackage+package(i)<=binSize
                currentPackage=currentPackage+package(i);
                i=i+1;
            else
                break
            end
        end
    end
    % close the current bin and record its free room
    totalBins=totalBins+1;
    wastage(totalBins)=binSize-currentPackage;
end
for i=1:totalBins
    totalWastage=totalWastage+wastage(i);
end
timerValue=toc;
str=sprintf('Total wastage: %d', totalWastage);
disp(str)
str=sprintf('Total bins used: %d', totalBins);
disp(str)
str=sprintf('Total time elapsed: %f', timerValue);
disp(str)
str=sprintf('Average values: %f', mean(package));
disp(str)
str=sprintf('Standard deviation: %f\n', std(package));
disp(str)
end
Best Fit Decreasing Algorithm
clear;clc;
binSize=100; % fixed size of bin
for i=[2^10 2^12 2^14 2^16]
    numberOfPackages=i; % number of packages for packing
    packarray=randi(binSize-1,numberOfPackages,1); % table of packages of random size
    package=sort(packarray); % sort data table (ascending, the MATLAB default)
    str=sprintf('dataset_%d',i);
    save(str);
end
Function create datasets
function a=data_partition(labs,package)
n=length(package);
for i=1:labs
    for j=1:n/labs
        k=j+1;
        % cyclic partition: row i takes every labs-th element of package
        a(i,j)=package(labs*(k-1)-mod(i,labs));
    end
end
Function data partition
function c=calc_wastage(binSize,data)
for j=1:length(data)/labs
    k=j+1;
    min=300;
    c(i,j)=min+sub_package(labs*(k-1)-mod(i,labs));
end
Function of calculation wastage
str=sprintf('-Best Fit Decreasing Algorithm-\n------------------\n');
disp(str);
str1=sprintf('dataset_%d',2^10);
load(str1);
totalBins=0; %total number bins used
totalWastage=0; %total wastage
bfd(numberOfPackages,binSize,totalBins,package,totalWastage)
n=length(package);
for tloop=1:10
    for labs=[1,2,4,8,16,32]
        matlabpool('open', 'local', labs);
        tic
        totalBins=0; %total number bins used
        totalWastage=0; %total wastage
        sub_package=data_partition(labs,package);
        str=sprintf('----Distributed Best Fit Decreasing Algorithm for clones:---%d',labs);
        disp(str)
        part=n/labs;
        spmd(labs)
            for p=1:labs
                temp=sub_package(p,:);
                if (labindex==p)
                    bfd(part,100,0,temp,0);
                end
            end
        end
        tt(tloop,labs)=toc;
        matlabpool('close');
    end
end
for labs=[1,2,4,8,16,32]
    % mean of the 10 runs after dropping the slowest and the fastest run
    at(labs)=(sum(tt(:,labs))-max(tt(:,labs))-min(tt(:,labs)))/8
end
save results_2_10
Function of parallelization
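The last loop of the script averages the ten timed runs per core count after discarding the slowest and the fastest run, which is why it divides by 8. A minimal Python sketch of that averaging is shown here; the function name `trimmed_mean` and the sample timings are hypothetical and serve only as an illustration.

```python
def trimmed_mean(times):
    """Average the timings after dropping one maximum and one minimum."""
    if len(times) < 3:
        raise ValueError("need at least three timings")
    return (sum(times) - max(times) - min(times)) / (len(times) - 2)


# Hypothetical timings of ten runs in seconds (not measured data).
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
print(trimmed_mean(sample))
```

Dropping the extremes makes the average less sensitive to a single run that was disturbed, for example by pool start-up or other load on the machines.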
Explanation of Code
The command matlabpool enables the parallel language features in the MATLAB language by starting a parallel job that connects this MATLAB client with a number of labs.
matlabpool('open', 'local', labs);
starts a worker pool using the local parallel configuration and the number of the available labs is labs.
matlabpool('close');
stops the worker pool, destroys the parallel job, and makes all parallel language features revert to using the MATLAB client for computing their results.
Total wastage
Dataset         serial   core 1   core 2   core 4   core 8   core 16
2^10 dataset     14560    14560    14760    14660    14960     15160
2^12 dataset     57980    57980    58080    58080    58580     58580
2^14 dataset    231264   231264   231364   231464   231864    231964
2^16 dataset    933434   933434   933534   933634   936934    934434
Results
Diagram of total wastage (one series per dataset: 2^10, 2^12, 2^14, 2^16)
Total used bins
Dataset         serial   core 1   core 2   core 4   core 8   core 16
2^10 dataset       665      665      667      666      669       671
2^12 dataset      2634     2634     2635     2635     2640      2640
2^14 dataset     10478    10478    10479    10480    10484     10485
2^16 dataset     42004    42004    42005    42006    42006     42014
Diagram of total used bins (vertical axis: bins used; one series per dataset: 2^10, 2^12, 2^14, 2^16)
Total execution time (seconds, as returned by toc)
Dataset          core 1    core 2    core 4    core 8   core 16   core 32
2^10 dataset     0.1082    0.093     0.105     0.1238    0.1584    0.2307
2^12 dataset     0.2883    0.1418    0.1257    0.1453    0.1953    0.2763
2^14 dataset     3.1       0.86056   0.3377    0.257     0.3459    0.5208
2^16 dataset    47.7559   12.1252    3.405     1.2479    1.0573    1.5077
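Speed-up and efficiency can be derived from the execution times in this table. The sketch below assumes the usual definitions, which the slides do not spell out: speed-up S(p) = T(1) / T(p) and efficiency E(p) = S(p) / p, where T(p) is the time on p cores.

```python
cores = [1, 2, 4, 8, 16, 32]

# Execution times in seconds, copied from the table above.
times = {
    "2^10": [0.1082, 0.093, 0.105, 0.1238, 0.1584, 0.2307],
    "2^12": [0.2883, 0.1418, 0.1257, 0.1453, 0.1953, 0.2763],
    "2^14": [3.1, 0.86056, 0.3377, 0.257, 0.3459, 0.5208],
    "2^16": [47.7559, 12.1252, 3.405, 1.2479, 1.0573, 1.5077],
}


def speedup_and_efficiency(row, cores):
    """Return (cores, speed-up, efficiency) tuples relative to one core."""
    t1 = row[0]
    return [(p, t1 / tp, t1 / tp / p) for p, tp in zip(cores, row)]


for name, row in times.items():
    print(f"{name} dataset")
    for p, s, e in speedup_and_efficiency(row, cores):
        print(f"  cores={p:2d}  speed-up={s:7.2f}  efficiency={e:5.2f}")
```

For the 2^16 dataset the ratio between 1 and 8 cores is about 38, which is more than the number of cores. A likely reason is that each worker packs only its own share of the data and therefore searches far fewer bins per weight, so the work per worker shrinks faster than linearly. For the small datasets the overhead dominates and the times grow again as cores are added.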
Diagram of total execution time (horizontal axis: core 1 to core 32; vertical axis: time; one series per dataset: 2^10, 2^12, 2^14, 2^16)
Thank you!