Favorite aspect of the session (if any)
It was easy to apply another algorithm by building on the material from last week. Writing the code together was also helpful, because we worked through it slowly, which made it easier to understand.
This is the code:
#Load the packages used below (readOGR comes from rgdal)
library(rgdal)
library(randomForest)
#From old session
vegdata <- readOGR("Plot_Points_BD_RS.shp", layer="Plot_Points_BD_RS")
vegdata <- data.frame(vegdata[,c(63,65:66, 157:178)])
vegdata <- vegdata[,-c((ncol(vegdata)-1):ncol(vegdata))]
vd <- vegdata[,c(1, 4:ncol(vegdata))]
vd$Ntzng_nht <- factor(vd$Ntzng_nht)
#Using the randomForest-Package
vd.rf<-randomForest(Ntzng_nht ~ .,data=vd)
#Print and plot of result of randomForest
print(vd.rf)
plot(vd.rf,main="Error per tree of randomForest")
#Print and plot of importance of variables in rF
vd.imp <- importance(vd.rf) #vd.imp was used below without being defined
print(vd.imp)
plot(vd.imp, col="white",main="Importance of variables",ylab="Importance",xlab="Variable")
text(vd.imp,cex= 0.7)#, pos=1)
The main output of randomForest:
The OOB (out-of-bag) estimate of the error rate is 41.46% in the run the figure was taken from. The confusion matrix shows that “Acker” is described best and has only little confusion with the other land-use classes. As expected, “Sonstiges” is always classified as “Sonstiges”. “Streuobstwiese” is often confused with “Wiese”, and “Wiese” in turn is confused with “Streuobstwiese”. This means the algorithm cannot differentiate the two very well.
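The OOB error and the per-class confusion described above can also be read directly from the fitted object. The following is only a minimal sketch: it uses the built-in iris data as a stand-in for vd (Species takes the role of Ntzng_nht), so the numbers it prints are not the ones from this session.

```r
library(randomForest)

set.seed(42)  # fixed seed only so the sketch is repeatable
# Assumption: iris stands in for vd, Species for Ntzng_nht
rf <- randomForest(Species ~ ., data = iris)

# OOB error after the last tree: last row, column "OOB" of err.rate
oob <- rf$err.rate[rf$ntree, "OOB"]

# Confusion matrix: rows are the observed classes, columns the predicted
# classes, plus a final class.error column
conf <- rf$confusion
counts <- conf[, colnames(conf) != "class.error"]

# Per-class error = 1 - correctly classified / row total
class_error <- 1 - diag(counts) / rowSums(counts)

print(round(oob, 4))
print(conf)
print(round(class_error, 4))
```

With the model from this session, the same lines should work after replacing rf with vd.rf. A large off-diagonal count in a row, as between “Streuobstwiese” and “Wiese”, is what drives up the class.error of that row.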
This is the importance of the variables in randomForest, as a scatterplot and as printed output:
The importance differs between the variables: some are very important, others much less so. The most important are m14, m15, m16, m20 and m21. The least important are m4, m11, m12, m13 and m17. The result of the classification would probably be much the same without these variables.
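The guess that the least important variables could be left out can be checked by refitting without them and comparing the OOB error. This is a hedged sketch on the built-in iris data as a stand-in for vd. The names rf_full, rf_reduced, least and reduced are made up for the illustration, and only a single variable is dropped here.

```r
library(randomForest)

set.seed(1)
rf_full <- randomForest(Species ~ ., data = iris)

# For classification without importance = TRUE, importance() returns a
# one-column matrix named MeanDecreaseGini
imp <- importance(rf_full)
least <- rownames(imp)[which.min(imp[, "MeanDecreaseGini"])]

# Refit without the least important variable
reduced <- iris[, setdiff(names(iris), least)]
set.seed(1)
rf_reduced <- randomForest(Species ~ ., data = reduced)

# Compare the OOB error after the last tree
oob_full <- rf_full$err.rate[rf_full$ntree, "OOB"]
oob_reduced <- rf_reduced$err.rate[rf_reduced$ntree, "OOB"]
cat("Dropped:", least, "\n")
cat("OOB error, all variables:", round(oob_full, 4), "\n")
cat("OOB error, reduced set:  ", round(oob_reduced, 4), "\n")
```

Because random forests are random, small differences between the two OOB values are expected even when the dropped variable carries no information, so it is worth repeating the comparison with several seeds before drawing a conclusion.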
This is the list of the variable importance values.
This is the plot of the randomForest function result. I do not know how to interpret it.
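One way to read this plot: for a classification forest, plot() draws the error rates stored in err.rate against the number of trees, with one line for the overall OOB error and one line per class. The sketch below, again on iris as a stand-in for vd, prints the column names and adds a legend so the lines can be told apart. The legend call is an addition for illustration and not part of the session code.

```r
library(randomForest)

set.seed(7)
# Assumption: iris stands in for vd, Species for Ntzng_nht
rf <- randomForest(Species ~ ., data = iris)

# One row per number of trees, one column for OOB and one per class
err <- rf$err.rate
print(colnames(err))
print(head(err, 3))

plot(rf, main = "Error rate by number of trees")
# plot() adds no legend itself; with up to five columns the default
# colours and line types simply follow the column order of err.rate
legend("topright", legend = colnames(err),
       col = seq_len(ncol(err)), lty = seq_len(ncol(err)), cex = 0.8)
```

If the lines flatten out well before the last tree, adding more trees is unlikely to change the error much. A class whose line stays high is one the forest separates poorly.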