LIBiFBTSVM

Introduction

The Python library LIBiFBTSVM was developed as a partial requirement for obtaining my Master’s degree in Systems Engineering.

It’s available here: https://github.com/kritchie/LIBiFBTSVM

The library

The project includes a Python re-implementation of the original iFBTSVM (incremental fuzzy bounded twin support vector machine) algorithm, aiming to make it accessible to the data science community.

This re-implementation focuses on efficient training by using the multiple processor cores available in most computers. It also follows the scikit-learn API, which provides the library's entry and exit points through familiar methods such as fit and score.

When to use it?

This library is most useful when you want to use an SVM to classify a dataset that is too large to fit in memory.

The difficulty is that training an SVM is fundamentally an optimisation problem: a standard solver needs to see the entire data set before it can choose which vectors best define the largest margin between the classes. An incremental algorithm such as iFBTSVM instead updates its model as new batches of data arrive, so the full data set never has to be loaded at once.
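To make the idea of learning without seeing all the data at once more concrete, here is a minimal sketch of batch-by-batch training. It is not the iFBTSVM algorithm: it is a toy linear SVM trained by stochastic gradient descent on the hinge loss, and the names `IncrementalLinearSVM`, `partial_fit`, and `batches` are hypothetical, chosen only for this illustration.

```python
import random


def batches(samples, size):
    """Yield successive chunks, standing in for data streamed from disk."""
    for start in range(0, len(samples), size):
        yield samples[start:start + size]


class IncrementalLinearSVM:
    """Toy linear SVM trained by SGD on the hinge loss (NOT iFBTSVM)."""

    def __init__(self, n_features, learning_rate=0.1, reg=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.learning_rate = learning_rate
        self.reg = reg
        self.seen = 0  # number of samples processed so far

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def partial_fit(self, batch):
        """Update the model from one batch; labels must be -1 or +1."""
        for x, y in batch:
            margin = y * self.decision(x)
            # Regularisation step: shrink the weights slightly.
            shrink = 1.0 - self.learning_rate * self.reg
            self.w = [shrink * wi for wi in self.w]
            # Hinge-loss step: only samples inside the margin move the model.
            if margin < 1.0:
                self.w = [wi + self.learning_rate * y * xi
                          for wi, xi in zip(self.w, x)]
                self.b += self.learning_rate * y
            self.seen += 1
        return self

    def predict(self, x):
        return 1 if self.decision(x) >= 0.0 else -1


# Synthetic, well-separated two-class data (made up for this sketch).
rng = random.Random(0)
data = []
for _ in range(200):
    label = rng.choice([-1, 1])
    point = [label * 2.0 + rng.uniform(-1.0, 1.0) for _ in range(2)]
    data.append((point, label))

model = IncrementalLinearSVM(n_features=2)
for batch in batches(data, size=50):
    # Only one batch needs to be held in memory at any time.
    model.partial_fit(batch)

correct = sum(model.predict(x) == y for x, y in data)
accuracy = correct / len(data)
print(f"Samples seen: {model.seen}, accuracy: {accuracy:.2f}")
```

The point of the sketch is the training loop: the model is updated from each batch in turn and never needs the earlier batches again, which is what makes this style of learning suitable for data that does not fit in memory.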

Usage Example

The intention was to design the library as a drop-in replacement for other SVM implementations in scikit-learn.

The classifier is declared as “class iFBTSVM(BaseEstimator)”, so it follows the same API conventions as other scikit-learn estimators.
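As a rough illustration of what those conventions look like, the sketch below defines a deliberately trivial estimator on top of scikit-learn's base classes. `MajorityClassifier` and its `default_label` parameter are hypothetical and have nothing to do with how iFBTSVM is implemented internally; the sketch only shows the methods a scikit-learn style classifier typically exposes.

```python
from collections import Counter

from sklearn.base import BaseEstimator, ClassifierMixin


class MajorityClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical toy estimator: always predicts the most frequent label."""

    def __init__(self, default_label=None):
        # Constructor arguments are stored unchanged so that
        # BaseEstimator.get_params() can report them.
        self.default_label = default_label

    def fit(self, X, y):
        counts = Counter(y)
        # Learned attributes conventionally end with an underscore.
        self.majority_ = (
            counts.most_common(1)[0][0] if counts else self.default_label
        )
        return self  # returning self allows calls to be chained

    def predict(self, X):
        return [self.majority_ for _ in X]


X = [[0.0], [1.0], [2.0]]
y = [1, 1, 0]

clf = MajorityClassifier().fit(X, y)
params = clf.get_params()    # provided by BaseEstimator
accuracy = clf.score(X, y)   # provided by ClassifierMixin (mean accuracy)
print(params, accuracy)
```

In this sketch, `get_params` comes from `BaseEstimator` and `score` comes from `ClassifierMixin`, while `fit` and `predict` are written by the estimator's author.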

The following is a toy example of how to initialise the iFBTSVM classifier and use it:


import time

from sklearn.datasets import load_iris

from libifbtsvm import iFBTSVM
from libifbtsvm.models.ifbtsvm import Hyperparameters


if __name__ == '__main__':

    dataset = load_iris()
    params = Hyperparameters(
        epsilon=0.0000001,
        fuzzy=0.01,
        C1=8,
        C2=2,
        C3=8,
        C4=2,
        max_iter=500,
        phi=0.00001,
        kernel=None,
    )

    # Initialise the iFBTSVM classifier
    ifbtsvm = iFBTSVM(parameters=params, n_jobs=1)

    # Training
    before = time.perf_counter()
    ifbtsvm.fit(X=dataset.data, y=dataset.target)
    after = time.perf_counter()
    elapsed = (after - before)

    # Evaluation: mean accuracy on the training data
    accuracy = ifbtsvm.score(X=dataset.data, y=dataset.target)
    print(f'Accuracy iFBTSVM: {accuracy * 100.0}% Train time: {elapsed}s')

The above code should output something similar to the following (the training time will vary between runs and machines):

Accuracy iFBTSVM: 97.33333333333334% Train time: 0.5585242840024875s

Feel free to open an issue directly in the GitHub repo if you have a question or if you find a bug.