          Support Vector Machine

          Support Vector Machine (SVM) is a supervised learning method that classifies two or more sets of data points by finding the boundaries between them.
          If we are given two sets of data points $X_j = \{x_i \mid x_i \in \mathbb{R}^d\}$, $j = 1, 2$, SVM finds the hyperplane $h(x_1, \dots, x_d) = w_1 x_1 + \dots + w_d x_d + b = 0$
          that optimally separates them. If we denote the hyperplane $h$ as $h(x) = w^T x + b$, then SVM tries to
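          $$\underset{w,\,b,\,\zeta}{\text{minimize}} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \zeta_i$$
          subject to the constraints $y_i(w^T x_i + b) \ge 1 - \zeta_i$, $\zeta_i \ge 0$, for all $i = 1, \dots, N$,
          where $N$ is the number of data points $x_i$, $\zeta_i$ is the slack (error) parameter of data point $x_i$, $C$ is the margin tolerance parameter, and
          $$y_i = \begin{cases} 1, & x_i \in X_1 \\ -1, & x_i \in X_2 \end{cases}$$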
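To make the objective concrete, the following minimal sketch evaluates it for a given hyperplane h(x) = wT x + b. It relies on the standard fact that, for fixed w and b, the smallest slack satisfying the constraints is max(0, 1 - yi h(xi)). The function name, the sample points, and the chosen w, b, and C are illustrative assumptions, not values produced by the system.

```python
def soft_margin_objective(w, b, C, points, labels):
    """Return (objective, slacks) for the hyperplane h(x) = w.x + b.

    For fixed w and b, the smallest feasible slack of point i is
    max(0, 1 - y_i * h(x_i)); the objective is 0.5*||w||^2 + C*sum(slacks).
    """
    slacks = []
    for x, y in zip(points, labels):
        h = sum(wk * xk for wk, xk in zip(w, x)) + b
        slacks.append(max(0.0, 1.0 - y * h))
    objective = 0.5 * sum(wk * wk for wk in w) + C * sum(slacks)
    return objective, slacks


# Hypothetical 2-D data: labels are +1 for class X1 and -1 for class X2.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (1.0, 1.5)]
labels = [1, 1, -1, -1]

objective, slacks = soft_margin_objective(
    w=(1.0, 1.0), b=-3.0, C=1.0, points=points, labels=labels
)
print(objective, slacks)
```

Only the last point falls inside the margin, so it is the only one with a nonzero slack; a larger C makes that violation more expensive.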
          To speed up the calculations, the system uses the Sequential Minimal Optimization (SMO) algorithm, which tries to find the solution to the related dual problem
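          $$\underset{\alpha}{\text{maximize}} \quad \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$
          subject to the constraints $0 \le \alpha_i \le C$, $i = 1, \dots, N$, and
          $$\sum_{i=1}^{N} \alpha_i y_i = 0$$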
          where $K$ is the kernel (transformation) function and $\alpha_i$ are the Lagrange multipliers. The default kernel function is the scalar product of two vectors, $K(x_i, x_j) = x_i^T x_j$.
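The dual objective and its constraints can be checked numerically with a short sketch like the one below. It uses the scalar product as the kernel, as the text describes for the default case. The function names, the tolerance `tol`, and the two sample points are assumptions made for illustration; this is not the SMO algorithm itself, only an evaluation of the quantity SMO maximizes.

```python
def linear_kernel(u, v):
    """Scalar product of two vectors (the default kernel described above)."""
    return sum(a * b for a, b in zip(u, v))


def dual_objective(alphas, labels, points, kernel=linear_kernel):
    """sum(alpha_i) - 0.5 * sum_i sum_j alpha_i alpha_j y_i y_j K(x_i, x_j)."""
    n = len(points)
    total = sum(alphas)
    for i in range(n):
        for j in range(n):
            total -= (0.5 * alphas[i] * alphas[j] * labels[i] * labels[j]
                      * kernel(points[i], points[j]))
    return total


def is_feasible(alphas, labels, C, tol=1e-9):
    """Check 0 <= alpha_i <= C and sum(alpha_i * y_i) = 0, up to tol (assumed)."""
    in_box = all(-tol <= a <= C + tol for a in alphas)
    balanced = abs(sum(a * y for a, y in zip(alphas, labels))) <= tol
    return in_box and balanced


# Hypothetical data: one point per class on the x-axis.
points = [(1.0, 0.0), (-1.0, 0.0)]
labels = [1, -1]
alphas = [0.5, 0.5]

value = dual_objective(alphas, labels, points)
print(value, is_feasible(alphas, labels, C=1.0))
```

For these two points the multipliers (0.5, 0.5) give w = (1, 0), and the dual value equals ½‖w‖², as expected when no slack is needed.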
          By default, the system automatically tries to find the best values for C and for ε (the boundary for the alphas), but you can also specify them explicitly.
          If more than two classes of data need to be separated, the system finds a one-to-one separation boundary between each pair of classes.
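As a small illustration of the pairwise scheme, the sketch below lists the class pairs that would each get their own boundary; m classes yield m(m-1)/2 pairs. The class labels and the helper name are hypothetical.

```python
from itertools import combinations


def class_pairs(class_labels):
    """Return every unordered pair of classes, one per separation boundary."""
    return list(combinations(class_labels, 2))


pairs = class_pairs(["A", "B", "C"])
print(pairs)  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```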
          Support Vector Machine (SVM)

          Calculate the class separation boundaries, using the Support Vector Machine, for the data provided in an uploaded file.

          Input File format:
          • File must be an ASCII file in comma separated values format (CSV).
          • File can contain multiple sets of data points whose separation you want to find.
          • The first line of each set of data points begins with the keyword SVM (case insensitive). Anything on the line after that keyword is ignored.
          • The second line begins with the keyword TYPE (case insensitive), followed (comma separated) by the value LINEAR for a linear separation boundary.
          • The next line is optional and specifies the parameter C: the keyword C (case insensitive) followed by its value.
            If omitted, the system automatically generates a set of the most probable values for C.
          • The next line is optional and specifies the parameter ε: the keyword E (case insensitive) followed by its value.
            If omitted, the system automatically generates a set of the most probable values for ε.
          • The next line must contain the keyword CLASSES (case insensitive) followed by the integer m, specifying the number of different data classes.
            The next m lines specify the labels, one for each data class.
          • After that comes the line containing the keyword DATA (case insensitive) followed by the integers n and d, specifying the number n of
            data points and the data point dimension d.
          • Finally come n lines, one for each data point. The first value is the data point's class, specified by either its label or its sequential number in the CLASSES list.
            It is followed by the d values of the data point.
          • Whitespaces between values and empty lines are ignored.

          Output File format:
          • Output File is an ASCII file in CSV (comma separated values) format.
          • For each data set in the input file, there will be a corresponding result set preceded by the keyword SOLUTION in the first line.
          • For each pair of classes in the data set, there will be a corresponding boundary specification, preceded by a line with the keyword
            SEPARATION, followed by the labels of those two classes.
          • If the boundary solution was found, the second line will contain its specification, which depends on the boundary type:
            • Linear boundary:   Keyword HYPERPLANE, followed by the coordinates (d values) of the hyperplane's normal vector w, followed by the hyperplane's free value b.
          • The third line will contain two key-value pairs specifying the values of the parameters C (key C) and ε (key E).
          • If the solution was not found, the second line will say "BOUNDARY NOT FOUND FOR THE FOLLOWING C AND E",
            followed by all pairs of C and ε values used in the calculations, each on a separate line.
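A reported hyperplane can be used to classify new points by checking the sign of h(x) = wT x + b. The sketch below parses a HYPERPLANE line and applies that test. The sample line and its values are invented for illustration, and treating a non-negative result as the first class of the pair is an assumption based on the definition of yi, not something the output description confirms.

```python
def parse_hyperplane(line):
    """Split 'HYPERPLANE, w1, ..., wd, b' into (w, b)."""
    fields = [f.strip() for f in line.split(",")]
    if fields[0].upper() != "HYPERPLANE":
        raise ValueError("not a HYPERPLANE line")
    numbers = [float(f) for f in fields[1:]]
    return numbers[:-1], numbers[-1]    # d normal-vector values, then b


def side(w, b, x):
    """Return +1 if h(x) = w.x + b is non-negative, otherwise -1."""
    h = sum(wk * xk for wk, xk in zip(w, x)) + b
    return 1 if h >= 0 else -1


# Hypothetical output line for a 2-D data set.
w, b = parse_hyperplane("HYPERPLANE, 1.0, 1.0, -3.0")
print(side(w, b, (3.0, 3.0)), side(w, b, (0.0, 0.0)))
```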