Machine learning using the Weka library with Java

Steps to set up and run the program:

Step 1: Download Weka library

Download page: http://www.cs.waikato.ac.nz/ml/weka/snapshots/weka_snapshots.html

Download stable.XX.zip, unzip it, and add weka.jar to the build path of your Java project in Eclipse as an external library.

Step 2: Create a text file named “sample.arff” in the following format:

@RELATION iris

@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}

@DATA

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
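
As a quick sanity check on the format above, here is a minimal sketch in plain Java. It is not part of Weka and is not how Weka parses ARFF. It counts the @ATTRIBUTE declarations and verifies that every row after @DATA has the same number of comma-separated values. The names ArffShapeCheck and firstBadRow are hypothetical, chosen only for illustration, and the check deliberately ignores quoted values and sparse rows.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class ArffShapeCheck {
    /**
     * Returns the 0-based index (counted among the data rows) of the first row
     * whose value count differs from the number of @ATTRIBUTE lines,
     * or -1 if every data row matches.
     */
    public static int firstBadRow(List<String> lines) {
        int attributes = 0;
        int row = 0;
        boolean inData = false;
        for (String raw : lines) {
            String line = raw.trim();
            // Skip blank lines and ARFF comments, which start with '%'
            if (line.isEmpty() || line.startsWith("%")) {
                continue;
            }
            if (!inData) {
                // Header keywords are matched case-insensitively
                String upper = line.toUpperCase(Locale.ROOT);
                if (upper.startsWith("@ATTRIBUTE")) {
                    attributes++;
                } else if (upper.startsWith("@DATA")) {
                    inData = true;
                }
                continue;
            }
            // Simplification (assumption): plain comma split, no quoting, no sparse rows
            if (line.split(",", -1).length != attributes) {
                return row;
            }
            row++;
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "@RELATION iris",
            "@ATTRIBUTE sepallength NUMERIC",
            "@ATTRIBUTE sepalwidth NUMERIC",
            "@ATTRIBUTE petallength NUMERIC",
            "@ATTRIBUTE petalwidth NUMERIC",
            "@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}",
            "@DATA",
            "5.1,3.5,1.4,0.2,Iris-setosa",
            "4.9,3.0,1.4,0.2,Iris-setosa");
        System.out.println("First malformed row: " + firstBadRow(sample));
    }
}
```

A result of -1 means every data row has one value per declared attribute. Any other result points at the first row to inspect.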

Step 3: Training and Testing by Using Weka

This code example uses a set of classifiers provided by Weka. It trains each model on the given dataset and tests it using 10-fold cross-validation.
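
To make the idea of k-fold cross-validation concrete, here is a small sketch in plain Java that partitions row indices into contiguous folds. Each fold serves once as the test set while the remaining rows form the training set. KFoldSketch and its methods are hypothetical names, and the sketch illustrates the general technique rather than reproducing Weka's own implementation of trainCV and testCV.

```java
import java.util.ArrayList;
import java.util.List;

public class KFoldSketch {
    /** Indices of the rows used for testing in the given fold (0-based). */
    public static List<Integer> testIndices(int n, int k, int fold) {
        int base = n / k;   // minimum number of rows per fold
        int extra = n % k;  // the first `extra` folds receive one additional row
        int size = base + (fold < extra ? 1 : 0);
        int start = fold * base + Math.min(fold, extra);
        List<Integer> indices = new ArrayList<>();
        for (int i = start; i < start + size; i++) {
            indices.add(i);
        }
        return indices;
    }

    /** Indices of all rows that are NOT in the test part of the given fold. */
    public static List<Integer> trainIndices(int n, int k, int fold) {
        List<Integer> test = testIndices(n, k, fold);
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (!test.contains(i)) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        int rows = 10;
        int folds = 3;
        for (int fold = 0; fold < folds; fold++) {
            System.out.println("fold " + fold
                + " test=" + testIndices(rows, folds, fold)
                + " train=" + trainIndices(rows, folds, fold));
        }
    }
}
```

With 10 rows and 3 folds, the test sets hold 4, 3 and 3 rows, and every row is tested exactly once.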

Sample program in Java:

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.evaluation.NominalPrediction;
import weka.classifiers.rules.DecisionTable;
import weka.classifiers.rules.PART;
import weka.classifiers.trees.DecisionStump;
import weka.classifiers.trees.J48;
import weka.core.FastVector;
import weka.core.Instances;

public class WekaTest {
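    // Open the data file for reading. A missing file is reported to the caller
    // instead of returning null, which would only fail later when the
    // reader is passed to the Instances constructor.
    public static BufferedReader readDataFile(String filename) throws FileNotFoundException {
        try {
            return new BufferedReader(new FileReader(filename));
        } catch (FileNotFoundException ex) {
            System.err.println("File not found: " + filename);
            throw ex;
        }
    }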

    // Train the model on one training set and evaluate it on the matching test set
    public static Evaluation classify(Classifier model,
            Instances trainingSet, Instances testingSet) throws Exception {
        Evaluation evaluation = new Evaluation(trainingSet);

        model.buildClassifier(trainingSet);
        evaluation.evaluateModel(model, testingSet);

        return evaluation;
    }

    // Percentage of predictions whose predicted class equals the actual class
    public static double calculateAccuracy(FastVector predictions) {
        double correct = 0;

        for (int i = 0; i < predictions.size(); i++) {
            NominalPrediction np = (NominalPrediction) predictions.elementAt(i);
            if (np.predicted() == np.actual()) {
                correct++;
            }
        }

        return 100 * correct / predictions.size();
    }

    // split[0][i] holds the training set and split[1][i] the test set of fold i
    public static Instances[][] crossValidationSplit(Instances data, int numberOfFolds) {
        Instances[][] split = new Instances[2][numberOfFolds];

        for (int i = 0; i < numberOfFolds; i++) {
            split[0][i] = data.trainCV(numberOfFolds, i);
            split[1][i] = data.testCV(numberOfFolds, i);
        }

        return split;
    }

    public static void main(String[] args) throws Exception {
        // Load the ARFF file created in Step 2
        BufferedReader datafile = readDataFile("sample.arff");

        Instances data = new Instances(datafile);
        datafile.close();

        // The class attribute is the last one
        data.setClassIndex(data.numAttributes() - 1);

        // Do 10-fold cross-validation
        Instances[][] split = crossValidationSplit(data, 10);

        // Separate split into training and testing arrays
        Instances[] trainingSplits = split[0];
        Instances[] testingSplits = split[1];

        // Use a set of classifiers
        Classifier[] models = {
            new J48(),           // a decision tree
            new PART(),
            new DecisionTable(), // decision table majority classifier
            new DecisionStump()  // one-level decision tree
        };

        // Run for each model
        for (int j = 0; j < models.length; j++) {

            // Collect every group of predictions for the current model in a FastVector
            FastVector predictions = new FastVector();

            // For each training-testing split pair, train and test the classifier
            for (int i = 0; i < trainingSplits.length; i++) {
                Evaluation validation = classify(models[j], trainingSplits[i], testingSplits[i]);

                predictions.appendElements(validation.predictions());

                // Uncomment to see the summary for each training-testing pair.
                //System.out.println(models[j].toString());
            }

            // Calculate overall accuracy of the current classifier on all splits
            double accuracy = calculateAccuracy(predictions);

            // Print the current classifier's name and accuracy,
            // followed by a separator line.
            System.out.println("Accuracy of " + models[j].getClass().getSimpleName() + ": "
                    + String.format("%.2f%%", accuracy)
                    + "\n---------------------------------");
        }

    }
}
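
The program above pools the predictions from all folds before computing accuracy, rather than averaging one accuracy figure per fold. The following plain-Java sketch shows that pooling step on its own, using integer class indices in place of Weka's prediction objects. PooledAccuracySketch and pooledAccuracy are hypothetical names used only for illustration, and the numbers in main are made-up inputs, not results from the sample dataset.

```java
import java.util.Locale;

public class PooledAccuracySketch {
    /**
     * Percentage of predictions that match the actual class,
     * counted over every fold together. Returns 0 if there are no predictions.
     */
    public static double pooledAccuracy(int[][] predictedPerFold, int[][] actualPerFold) {
        int correct = 0;
        int total = 0;
        for (int fold = 0; fold < predictedPerFold.length; fold++) {
            for (int i = 0; i < predictedPerFold[fold].length; i++) {
                if (predictedPerFold[fold][i] == actualPerFold[fold][i]) {
                    correct++;
                }
                total++;
            }
        }
        return total == 0 ? 0.0 : 100.0 * correct / total;
    }

    public static void main(String[] args) {
        // Two folds of different sizes; values are class indices (illustrative only)
        int[][] predicted = { {0, 1, 1}, {2, 2} };
        int[][] actual    = { {0, 1, 2}, {2, 0} };
        double accuracy = pooledAccuracy(predicted, actual);
        System.out.println(String.format(Locale.ROOT, "%.2f%%", accuracy));
    }
}
```

Pooling weights every test row equally, which matters when the folds differ in size.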

 

 
