






Dr. Benny Raphael
People always ask me how I came up with this strange algorithm with four nested cycles and funny ways to update a probability distribution function (PDF). After repeating the story many times, I decided to write it down. But I never found the time until the Christmas break of 2003. I realised that Christmas is the time to write about the birth of an algorithm that might bring new light to this dark world of difficult optimisation problems. PGSL was born after a long journey through spacetime: from Madras to Lausanne in space, from 1991 to 2003 in time. I will briefly summarise this journey.
Dr. Rajeev was a lecturer at that time and was also doing his PhD under the guidance of Prof. C.S. Krishnamoorthy. He did his PhD on genetic algorithms (GAs). I first learnt about GAs through him. I could never believe that you can find the optimum by randomly cutting and pasting parts of solutions and randomly flipping bits. Dr. Rajeev's answer was that the convergence of the simple genetic algorithm had already been proved. I always wanted to experiment with GAs to find out how they really worked and to test whether they could be used to find solutions to nonlinear equations. Again, I did not have time to do these experiments during my MS.
My experiments with the SGA helped me understand probabilistic algorithms better. I found that the probability of getting a better solution through crossover is very low (for nonlinear continuous variables). Faster improvements were obtained by randomly changing the value of a single variable at a time. There is no need to combine two solutions to get a better solution! Using this technique I was able to get faster convergence. The procedure inevitably resulted in local minima; nevertheless, I was happy that it was much more reliable than the Newton-Raphson method. During my attempts to avoid local minima without slowing down convergence, I realised that the size of the search space was crucial. After many experiments, I developed an algorithm called Quantum Genetic Search (QGS). The idea was to reduce the size of the search space by restricting movements to discrete positions along each axis (similar to energy levels in quantum mechanics). QGS worked well for certain types of problems, but required a lot more work to be usable as a general-purpose optimisation program. During this time, my research focus shifted to case-based reasoning and I never got time to continue experimenting with optimisation. In fact, I even lost the QGS code. QGS was left to die and decay in one of my computers at Glasgow. The only remnant of QGS that has been found is a report that I wrote during this period. (Bimal Kumar still believes that PGSL is the same as QGS. Actually, I did not realise the similarities between QGS and PGSL until he mentioned them to me at a conference in Iceland.)
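The QGS code itself is lost, so the following is only a minimal sketch of the idea as described here: single-variable random moves restricted to a fixed number of discrete "levels" along each axis. The number of levels, the greedy acceptance rule, and all constants are illustrative assumptions, not the original QGS.

```python
import random

def quantum_random_search(f, bounds, levels=16, iters=2000, seed=0):
    """Minimise f over a box by restricting each variable to a fixed
    number of discrete positions along its axis (like energy levels).
    Illustrative sketch; not the original QGS."""
    rng = random.Random(seed)
    # Discrete grid positions along each axis.
    grids = [[lo + (hi - lo) * k / (levels - 1) for k in range(levels)]
             for lo, hi in bounds]
    best = [rng.choice(g) for g in grids]
    best_f = f(best)
    for _ in range(iters):
        # Randomly change the value of a single variable at a time.
        x = list(best)
        i = rng.randrange(len(x))
        x[i] = rng.choice(grids[i])
        fx = f(x)
        if fx < best_f:
            best, best_f = x, fx
    return best, best_f
```

Because movements are confined to the grid, the search space is drastically smaller, but the attainable accuracy is limited by the grid spacing, which hints at why extra work was needed to make it general-purpose.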
In 1998, I was working on an industrial project sponsored by Karl Steiner Engineering. This project involved probabilistic analysis. I first implemented Monte Carlo simulation for computing probabilities. I was not happy with the results because I was not getting "perfect triangles" when the theoretical distributions were triangles (as when two rectangular distributions are added). Then I developed my own algorithm for computing PDFs through numerical integration. This algorithm is very efficient when the dependency graph is a tree structure. However, when there are repeated variables the algorithm breaks down due to exponential complexity with respect to the number of repeated variables. What struck me was the robustness of Monte Carlo simulation: you can always get an approximate solution. The solutions are reasonably accurate where probabilities are large; the difficulty is only in regions containing small probabilities. On the other hand, numerical integration is extremely accurate but breaks down completely beyond a few (repeated) variables.
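The triangle example can be reproduced in a few lines. The sketch below (bin count and sample size are arbitrary choices) estimates by Monte Carlo simulation the PDF of the sum of two rectangular distributions, whose theoretical shape is a triangle peaking at 1, and shows why the estimate is never a perfect triangle.

```python
import random

def sum_of_two_uniforms(n=200000, seed=0):
    """Monte Carlo estimate of the PDF of X + Y, where X and Y are
    uniform on (0, 1). The theoretical PDF is a triangle with its
    peak (height 1) at 1; the estimate carries sampling noise."""
    rng = random.Random(seed)
    bins = [0] * 20                        # histogram over [0, 2)
    width = 2.0 / len(bins)
    for _ in range(n):
        s = rng.random() + rng.random()
        bins[min(int(s / width), len(bins) - 1)] += 1
    # Normalise counts into density values.
    return [c / (n * width) for c in bins]

density = sum_of_two_uniforms()
```

The estimated density rises roughly linearly towards the middle bins and falls back, but the tails, where probabilities are small, are exactly where the relative error is largest.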
The success of Monte Carlo prompted me to think about applying it to optimisation. My initial idea was to generate a PDF representing good regions through sampling and then use this PDF to generate more solutions in good regions. I was convinced that this would work, but resisted the temptation to start the experiment. From my previous experience, I knew how I could get sucked into it and could spend weeks and months working on it.
In 1999, Yvan Robert-Nicoud joined the laboratory and started his PhD in the area of bridge diagnosis. I was supervising his work and we used to spend a lot of time discussing his research. At that time, he was trying to find the values of the parameters of the Lutrive bridge that produced predicted deflections matching measurements. One day, I saw something interesting on his computer. He had an Excel sheet in which the value of Young's modulus (E) varied in constant steps along one column, and another column held the corresponding predicted deflection. He had implemented a complete statically indeterminate structural analysis in a spreadsheet (including numerical integration)! He showed me that he could find the value of E that produced the best match with measurements using this spreadsheet. I thought that this was remarkable. It struck me that he was actually performing optimisation in a single variable. Now I could not wait any longer for my optimisation experiments. I told him that I would develop a better tool for this task, one with which he would be able to handle multiple variables.
So in 1999, the first version of PGSL was born. At that time, I called it PGS (Probabilistic Global Search). This version had only two cycles: sampling and probability updating. It did not converge fast enough, so I added another cycle for faster convergence: the focusing cycle. For parameter estimation of the Lutrive bridge, this was enough. But when I tested it on highly nonlinear functions, I found that it got trapped in local minima most of the time. So I introduced the subdomain cycle to restart the search. After much trial and error I had the near-perfect algorithm!! Now the algorithm had many small details that made it work well on a wide variety of problems. There were conditions to prevent premature convergence, there were mechanisms to avoid overly focused search in small regions, etc.
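For readers who want a feel for the structure, here is a minimal sketch of a search with four nested cycles in the spirit described above: sampling, probability updating, focusing, and a subdomain restart. It is not the published PGSL algorithm; the cycle counts, the weight-doubling update, the subdivision rule, and the subdomain halving are all illustrative assumptions.

```python
import random

def pgsl_style_search(f, bounds, seed=0):
    """Illustrative four-nested-cycle search; all details are assumptions."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")

    def sample(axes):
        # Per axis: pick an interval in proportion to its weight,
        # then a uniform value inside it.
        x = []
        for cells in axes:
            r = rng.uniform(0, sum(w for _, _, w in cells))
            chosen = cells[-1]
            for cell in cells:
                if r <= cell[2]:
                    chosen = cell
                    break
                r -= cell[2]
            x.append(rng.uniform(chosen[0], chosen[1]))
        return x

    cur = list(bounds)
    for _ in range(4):                                   # subdomain cycle
        # Each axis starts as 8 equal-width intervals with equal weight.
        axes = [[(lo + (hi - lo) * k / 8, lo + (hi - lo) * (k + 1) / 8, 1.0)
                 for k in range(8)] for lo, hi in cur]
        for _ in range(4):                               # focusing cycle
            for _ in range(4):                           # probability-updating cycle
                local_x, local_f = None, float("inf")
                for _ in range(25):                      # sampling cycle
                    x = sample(axes)
                    fx = f(x)
                    if fx < local_f:
                        local_x, local_f = x, fx
                if local_f < best_f:
                    best_x, best_f = local_x, local_f
                # Update the PDF: double the weight of the interval
                # that contained the best sample of this cycle.
                for i, cells in enumerate(axes):
                    axes[i] = [(lo, hi, w * 2.0) if lo <= local_x[i] <= hi
                               else (lo, hi, w) for lo, hi, w in cells]
            # Focusing: subdivide the interval holding the best point,
            # so later samples concentrate around it.
            for i, cells in enumerate(axes):
                new = []
                for lo, hi, w in cells:
                    if lo <= best_x[i] <= hi:
                        mid = (lo + hi) / 2
                        new += [(lo, mid, w), (mid, hi, w)]
                    else:
                        new.append((lo, hi, w))
                axes[i] = new
        # Subdomain: restart with the box narrowed around the best point.
        cur = [(max(lo, best_x[i] - (hi - lo) / 4),
                min(hi, best_x[i] + (hi - lo) / 4))
               for i, (lo, hi) in enumerate(cur)]
    return best_x, best_f
```

Even this toy version shows the division of labour: the inner cycles exploit the current PDF, while the subdomain restart guards against getting trapped in one region.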
In the beginning, PGS remained a secret: only Yvan knew about its existence. There were two reasons for this. My earlier algorithm for numerical integration of probabilities was overhyped and in the end was not very useful. Also, I did not know whether similar algorithms already existed. I searched for a while and could not find any similar algorithm. Finally, I decided that the best way to find out was to publish it. I wrote the first paper on PGSL with Prof. Smith. He suggested that a four-letter acronym is better than a three-letter one. So PGS was changed to PGSL, Probabilistic Global Search, Lausanne, to give credit to the place where it was developed. The first paper was published at the EGSEAI conference in Reykjavik, 2000. At that time I did not know about algorithms such as adaptive random search and controlled random search. In fact, I learnt about these algorithms only from the comments of anonymous reviewers. Now I know that PGSL belongs to a class of algorithms based on sampling using a PDF. The uniqueness of PGSL is in the procedure to update the PDF in four nested cycles.
There have been many developments in PGSL during the last three years. The basics of the algorithm have remained the same. However, small details have been changed in order to perform well on a wide range of problems. The number of parameters to be tuned has also been reduced from four to one. In PGSL version 3, the only parameter that needs to be specified is the maximum number of evaluations of the objective function. All other parameters are computed automatically. There is also a Java version of PGSL, called SPGSL (Simplified PGSL). The algorithm has been simplified (at the cost of reliability) and is better suited for smoother objective functions.
A footnote
I was eventually able to solve the nonlinear equation problem using gradients. The trick is to use a combination of line search and gradients: compute the gradient, perform a line search in this direction, get the best point, compute the gradient at this point, and repeat until convergence is reached. This procedure is very efficient and does not suffer from stability problems.
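A minimal sketch of this procedure, applied to F(x) = 0 by minimising the sum of squared residuals; the finite-difference gradient and the step-halving line search below are illustrative assumptions, not the exact implementation:

```python
def solve_nonlinear(F, x0, iters=200, tol=1e-10):
    """Solve F(x) = 0 by minimising g(x) = sum(F_i(x)^2) with repeated
    gradient + line-search steps. Illustrative sketch."""
    def g(x):
        return sum(v * v for v in F(x))

    def grad(x, h=1e-6):
        # Forward-difference approximation of the gradient of g.
        g0 = g(x)
        return [(g([xj + (h if j == i else 0.0)
                    for j, xj in enumerate(x)]) - g0) / h
                for i in range(len(x))]

    x = list(x0)
    for _ in range(iters):
        gx = g(x)
        if gx < tol:
            break
        d = grad(x)
        # Line search: try geometrically decreasing steps along the
        # negative gradient and keep the best point found.
        candidates = [[xi - (0.5 ** k) * di for xi, di in zip(x, d)]
                      for k in range(20)]
        xn = min(candidates, key=g)
        if g(xn) >= gx:
            break        # no step along -grad improves; stop
        x = xn
    return x
```

Taking the best point along the search direction, rather than a fixed step, is what keeps the iteration stable even when a full gradient step would overshoot.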