    UC Research Repository

    Hyper-parameter and kernel optimization in support vector machines.

    File: Yamada, Shinichi_Final PhD Thesis.pdf (4.031Mb)
    Author: Yamada, Shinichi
    Date: 2018
    Permanent Link: http://hdl.handle.net/10092/16407
    Thesis Discipline: Computer Science
    Degree Grantor: University of Canterbury
    Degree Level: Doctoral
    Degree Name: Doctor of Philosophy

    Support vector machines (SVMs) are capable of producing high-quality solutions for many types of real-world classification problems. SVMs classify data by selecting hyperplanes that provide the maximum margin between classes. Intuitively, the farther a data point is from the decision boundary (hyperplane), the lower the chance of a false prediction. A fundamental theorem of SVMs shows that the generalization error rate of an SVM is independent of the dimensionality of the input space and depends only on the ratio of the margin to the radius of a sphere that contains all the data. The expressive power of SVMs can be extended by mapping input data into high-dimensional spaces. In these spaces the computational complexity of SVMs does not increase much, because all of the computation is done through kernels (the “kernel trick”). The maximal-margin principle and the kernel trick make SVMs useful and efficient computational tools in machine learning.
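
    A schematic form of this radius-margin dependence, as it is usually stated (Vapnik's leave-one-out bound; the notation here is ours, not quoted from the thesis): for data contained in a sphere of radius R and separated with margin γ by an SVM trained on ℓ examples,

        \mathbb{E}\left[\operatorname{err}_\ell\right] \;\le\; \frac{1}{\ell}\,\mathbb{E}\!\left[\frac{R^2}{\gamma^2}\right]

    so the bound involves only the ratio R/γ, not the dimension of the (possibly kernel-induced) feature space.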

    While the parameters (support vectors) of an SVM can be found efficiently, the choice of kernel and the optimal values of the hyper-parameters within that kernel remain open problems. This thesis addresses both: finding optimal hyper-parameters and constructing an optimal kernel. Since the search spaces of these problems are very large, and often lack exploitable properties such as convexity, the thesis tackles them with a set of tools based on (or influenced by) meta-heuristic algorithms and evolutionary computation.

    For hyper-parameter optimization, we develop a surface-estimation method using Bézier curves. First, we determine the search region and choose samples on a regular grid within it. We then construct a surface that interpolates the prediction accuracy at the sampled points and use it to detect the most relevant region for a finer second search. The proposed method makes the search process more efficient and automatic, and it can be applied to multi-dimensional spaces.
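
    As a rough illustration of this two-stage search, the Python sketch below scores a coarse grid, smooths it with a tensor-product Bézier surface, and centres the finer second search on the surface's peak. The grid ranges, the use of grid scores directly as control points (which smooths rather than interpolates), and the placeholder accuracies are our assumptions, not details taken from the thesis:

        import numpy as np
        from scipy.special import comb

        def bernstein(n, i, t):
            # Bernstein basis polynomial B_{i,n}(t)
            return comb(n, i) * (t ** i) * ((1 - t) ** (n - i))

        def bezier_surface(ctrl, u, v):
            # Tensor-product Bezier surface with control points `ctrl`,
            # evaluated at (u, v) in [0, 1]^2.
            n, m = ctrl.shape[0] - 1, ctrl.shape[1] - 1
            bu = np.array([bernstein(n, i, u) for i in range(n + 1)])
            bv = np.array([bernstein(m, j, v) for j in range(m + 1)])
            return bu @ ctrl @ bv

        # Coarse grid over (log2 C, log2 gamma); `scores` would normally hold
        # the cross-validated accuracy of an SVM at each grid point. Random
        # values stand in here so the sketch runs on its own.
        log_C = np.linspace(-5, 15, 6)
        log_gamma = np.linspace(-15, 3, 6)
        rng = np.random.default_rng(0)
        scores = rng.uniform(0.6, 0.95, size=(6, 6))

        # Evaluate the surface densely and locate its peak.
        fine = np.linspace(0.0, 1.0, 101)
        surf = np.array([[bezier_surface(scores, u, v) for v in fine] for u in fine])
        iu, iv = np.unravel_index(surf.argmax(), surf.shape)
        centre_C = log_C[0] + fine[iu] * (log_C[-1] - log_C[0])
        centre_gamma = log_gamma[0] + fine[iv] * (log_gamma[-1] - log_gamma[0])
        print(f"refine search around log2 C = {centre_C:.2f}, log2 gamma = {centre_gamma:.2f}")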

    The optimization of hyper-parameters can be further improved by detecting the initial search region automatically and by making the search within that region more efficient. We achieve this by taking an approach based on evolutionary computation. Particle Swarm Optimization (PSO) is an algorithm inspired by social behaviour such as that of bird flocks and fish schools. We propose a unified approach to the discretization of continuous PSO. Currently, binary PSO and continuous PSO are treated as two separate subjects, and most of the theory developed for continuous PSO cannot be applied to binary PSO. We propose a general model that combines the two and derive several binary PSO models and discrete PSO models (where each dimension can take more than two values) from existing continuous PSO models.
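
    For orientation, below is a minimal sketch of the classical sigmoid-based binary PSO of Kennedy and Eberhart, the usual starting point for such derivations (not the thesis's unified model): the continuous velocity update is kept, and each squashed velocity component is read as the probability that the corresponding bit is 1. All constants are conventional defaults, not values from the thesis.

        import numpy as np

        def binary_pso(fitness, dim, n_particles=20, iters=50,
                       w=0.7, c1=1.5, c2=1.5, seed=0):
            rng = np.random.default_rng(seed)
            v = rng.uniform(-4.0, 4.0, (n_particles, dim))            # velocities
            x = (rng.random((n_particles, dim)) < 0.5).astype(int)    # bit strings
            pbest = x.copy()
            pbest_f = np.array([fitness(p) for p in x])
            g = pbest[pbest_f.argmax()].copy()                        # global best
            g_f = pbest_f.max()
            for _ in range(iters):
                r1, r2 = rng.random((2, n_particles, dim))
                v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x) # continuous update
                v = np.clip(v, -4.0, 4.0)                             # avoid sigmoid saturation
                # Interpret each squashed velocity as P(bit = 1) and resample the bits.
                x = (rng.random((n_particles, dim)) < 1.0 / (1.0 + np.exp(-v))).astype(int)
                f = np.array([fitness(p) for p in x])
                better = f > pbest_f
                pbest[better], pbest_f[better] = x[better], f[better]
                if f.max() > g_f:
                    g, g_f = x[f.argmax()].copy(), f.max()
            return g, g_f

        # Toy check: maximize the number of 1-bits (OneMax).
        best, score = binary_pso(lambda bits: int(bits.sum()), dim=16)
        print(best, score)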

    We apply the proposed discrete PSO to the hyper-parameter search in SVMs. To further increase efficiency, we introduce an adaptive calibration of the evaluation points. We also present a simple algorithm that uses PSO to adjust the search region automatically. In the experiments, we show that the proposed discrete PSO is more efficient than surface-estimation methods.
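
    To make the hyper-parameter search concrete, a hypothetical fitness function for the binary_pso sketch above might decode a bit string into (C, gamma) and return a cross-validated accuracy. The adaptive calibration of evaluation points and the automatic region adjustment are omitted; the dataset, bit widths, and search ranges are illustrative choices, not the thesis's experimental setup:

        import numpy as np
        from sklearn.datasets import load_breast_cancer
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        X, y = load_breast_cancer(return_X_y=True)

        def decode(bits, lo, hi):
            # Plain binary encoding of a value in [lo, hi].
            ival = int("".join(str(b) for b in bits), 2)
            return lo + ival * (hi - lo) / (2 ** len(bits) - 1)

        def fitness(bits):
            # First 8 bits encode log2 C, last 8 bits encode log2 gamma.
            C = 2.0 ** decode(bits[:8], -5.0, 15.0)
            gamma = 2.0 ** decode(bits[8:], -15.0, 3.0)
            return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

        # Reusing the binary_pso sketch above:
        best_bits, best_acc = binary_pso(fitness, dim=16, iters=20)
        print(f"best 3-fold accuracy: {best_acc:.3f}")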

    For constructing the optimal kernel, we propose a new multiple kernel learning (MKL) method that minimizes the ratio of the radius to the margin rather than merely maximizing the margin. The proposed method has several favorable properties: it needs no additional parameters to tune, and it returns solutions of quality comparable to the best solutions of traditional MKL methods with finely tuned parameters.
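
    Schematically (our notation; the exact formulation is in the thesis), radius-margin MKL learns nonnegative kernel weights by minimizing the bound from the first paragraph rather than the margin alone:

        \min_{\mu \ge 0,\; \sum_k \mu_k = 1} \; \frac{R^2(K_\mu)}{\gamma^2(K_\mu)}, \qquad K_\mu = \sum_k \mu_k K_k,

    where γ(K_μ) is the SVM margin under the combined kernel K_μ and R(K_μ) is the radius of the smallest sphere enclosing the data in the feature space that K_μ induces.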

    The application of MKL does not always improve the prediction accuracy of simple SVMs with fine-tuned parameters. To improve prediction accuracy further, we construct optimal kernels from nonlinear combinations of kernels and subsets of features, using a system built on Genetic Programming and binary PSO (see the closure-property sketch below). In the experiments, we show that the proposed method dramatically improves prediction accuracy over SVMs with fine-tuned parameters.
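
    A closure property makes the Genetic Programming search well defined: sums and products of valid kernels are again valid kernels, so any GP tree over {+, *} with base kernels at its leaves yields a usable kernel. A minimal illustration (the particular tree, base kernels, and dataset are our choices, not the thesis's):

        import numpy as np
        from sklearn.datasets import make_moons
        from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel
        from sklearn.svm import SVC

        def combined_kernel(A, B):
            # A hypothetical GP tree: (k_rbf * k_poly) + k_rbf.
            k_rbf = rbf_kernel(A, B, gamma=0.5)
            k_poly = polynomial_kernel(A, B, degree=2)
            return k_rbf * k_poly + k_rbf

        # SVC accepts a callable kernel that returns the Gram matrix.
        X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
        clf = SVC(kernel=combined_kernel).fit(X, y)
        print("training accuracy:", clf.score(X, y))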

    For both tasks, hyper-parameter optimization and optimal kernel construction, the proposed methods overcome the difficulties that standard SVM methods face. This is achieved by combining the standard methods with meta-heuristic algorithms, which improves performance in terms of both accuracy and efficiency.

    Collections
    • Engineering: Theses and Dissertations [2163]
    Rights
    https://canterbury.libguides.com/rights/theses
