Introduction
The Python library LIBiFBTSVM was developed as a partial requirement for my Master’s degree in Systems Engineering.
It’s available here: https://github.com/kritchie/LIBiFBTSVM
The library
The project includes a Python re-implementation of the original iFBTSVM algorithm, aiming to make it accessible to the data science community.
This re-implementation focuses on training classifiers efficiently by using the multiple processor cores available in most computers. It also follows the scikit-learn API, which serves as the library’s entry and exit points.
When to use it?
This library is most useful when you want to use an SVM to classify a dataset that is too large to fit in memory.
The fundamental problem is that training an SVM is an optimisation problem that requires seeing the entire dataset before choosing which vectors are the best candidates for defining the largest margin between the classes.
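To make the memory constraint concrete, here is a minimal sketch of the general idea of learning from one batch at a time instead of loading the whole dataset. It is not the iFBTSVM algorithm: it is a plain linear SVM trained by stochastic subgradient descent on the hinge loss, using synthetic data, and every name in it (batches, partial_update, predict) is hypothetical and not part of LIBiFBTSVM.

```python
import random


def batches(n_samples, batch_size, seed=0):
    """Yield small labelled batches one at a time (stand-in for reading chunks from disk)."""
    rng = random.Random(seed)
    for start in range(0, n_samples, batch_size):
        batch = []
        for _ in range(min(batch_size, n_samples - start)):
            label = rng.choice([-1, 1])
            # Two well-separated synthetic clusters centred at (-2, -2) and (+2, +2).
            point = [2.0 * label + rng.uniform(-1, 1), 2.0 * label + rng.uniform(-1, 1)]
            batch.append((point, label))
        yield batch


def partial_update(weights, bias, batch, learning_rate=0.01, reg=0.01):
    """One pass of hinge-loss subgradient descent over a single batch."""
    for x, y in batch:
        margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
        # L2 shrinkage is always applied; the hinge term only when the margin is violated.
        weights = [w * (1 - learning_rate * reg) for w in weights]
        if margin < 1:
            weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
            bias += learning_rate * y
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted label (+1 or -1) for a single point."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else -1


weights, bias = [0.0, 0.0], 0.0
for batch in batches(n_samples=1000, batch_size=50):
    # Only this batch is held in memory at any time.
    weights, bias = partial_update(weights, bias, batch)

# Evaluate on freshly generated points that were not used for training.
test_set = [pair for batch in batches(200, 50, seed=1) for pair in batch]
accuracy = sum(predict(weights, bias, x) == y for x, y in test_set) / len(test_set)
print(f"Accuracy on held-out points: {accuracy:.2f}")
```

The model state here is just a weight vector and a bias, so memory use stays constant no matter how many batches are streamed through it.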
Usage Example
The library is designed to be a drop-in replacement for other SVM implementations in scikit-learn.
It’s declared as “class iFBTSVM(BaseEstimator)”, so it inherits from scikit-learn’s BaseEstimator and exposes the same API as other scikit-learn estimators.
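A shared estimator API means that code written against fit and score does not need to know which classifier it is handling. The sketch below illustrates this with a deliberately trivial stand-in: MajorityClassifier and train_and_score are hypothetical names invented for this example and are not part of LIBiFBTSVM or scikit-learn.

```python
import time


class MajorityClassifier:
    """Hypothetical stand-in estimator exposing fit/predict/score like a scikit-learn classifier."""

    def fit(self, X, y):
        # Remember the most frequent label seen during training.
        self.majority_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]

    def score(self, X, y):
        # Mean accuracy, the same metric scikit-learn classifiers report.
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


def train_and_score(estimator, X, y):
    """Fit any estimator-like object and return (accuracy, training time in seconds)."""
    start = time.perf_counter()
    estimator.fit(X, y)
    elapsed = time.perf_counter() - start
    return estimator.score(X, y), elapsed


X = [[0.0], [0.1], [0.9], [1.0], [1.1]]
y = [0, 0, 1, 1, 1]
accuracy, elapsed = train_and_score(MajorityClassifier(), X, y)
print(f"Accuracy: {accuracy:.2f} Train time: {elapsed}s")
```

Assuming the estimator follows the same conventions, passing an iFBTSVM instance or a scikit-learn SVM in place of MajorityClassifier() should require no change to train_and_score.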
The following is a toy example of how to initialise the classifier and use it:
import time
from sklearn.datasets import load_iris
from libifbtsvm import iFBTSVM
from libifbtsvm.models.ifbtsvm import Hyperparameters
if __name__ == '__main__':
    dataset = load_iris()
    params = Hyperparameters(
        epsilon=0.0000001,
        fuzzy=0.01,
        C1=8,
        C2=2,
        C3=8,
        C4=2,
        max_iter=500,
        phi=0.00001,
        kernel=None,
    )
    # Initialisation iFBTSVM
    ifbtsvm = iFBTSVM(parameters=params, n_jobs=1)
    # Training
    before = time.perf_counter()
    ifbtsvm.fit(X=dataset.data, y=dataset.target)
    after = time.perf_counter()
    elapsed = (after - before)
    # Prediction
    accuracy = ifbtsvm.score(X=dataset.data, y=dataset.target)
    print(f'Accuracy iFBTSVM: {accuracy * 100.0}% Train time: {elapsed}s')
The above code should output something like the following (the training time will vary from run to run):
Accuracy iFBTSVM: 97.33333333333334% Train time: 0.5585242840024875s
Feel free to open an issue directly in the GitHub repo if you have a question or if you find a bug.