Category Archives: Advanced Methods

MEG-AM11-2

  • Favorite aspect of the session (if any)

It was easy to apply a different algorithm by building on the material from last week. Writing the code together was also helpful: we developed it jointly and slowly, which made it easier to understand.

This is the code:

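setwd("E:/SS2014/AM/W11/Plot_Points_BD_RS (1)")

library(rgdal)         # provides readOGR()
library(randomForest)

# From the previous session: read the plot points and keep the needed columns
vegdata <- readOGR("Plot_Points_BD_RS.shp", layer="Plot_Points_BD_RS")
vegdata <- data.frame(vegdata[,c(63,65:66, 157:178)])
# Drop the last two columns of the converted data frame
vegdata <- vegdata[,-c((ncol(vegdata)-1):ncol(vegdata))]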

vd <- vegdata[,c(1, 4:ncol(vegdata))]
vd$Ntzng_nht <- factor(vd$Ntzng_nht)

#Using the randomForest-Package
vd.rf<-randomForest(Ntzng_nht ~ .,data=vd)
vd.imp<-round(importance(vd.rf), 2)

#Print and plot of result of randomForest
print(vd.rf)
plot(vd.rf,main="Error per tree of randomForest")

#Print and plot of importance of Variables in rF
print(vd.imp)
plot(vd.imp, col="white",main="Importance of variables",ylab="Importance",xlab="Variable")
# Label each point with its variable name instead of its row index
text(vd.imp, labels=rownames(vd.imp), cex=0.7)

The main output of randomForest:

The OOB (out-of-bag) error estimate is 41.46% in the run the figure was taken from. The confusion matrix shows that “Acker” is classified best, with only little confusion with other land-use classes. Of course, “Sonstiges” is always “Sonstiges”. “Streuobstwiese” is often confused with “Wiese”, and “Wiese” in turn with “Streuobstwiese”. This means the algorithm cannot distinguish the two very well.
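The OOB error and the confusion matrix can also be read directly from the fitted object instead of from the printed summary. The following is a minimal sketch, assuming the built-in iris data as a stand-in for the vegetation data (the class names and numbers will therefore differ from the ones above); the seed value is arbitrary.

```r
library(randomForest)

set.seed(42)  # arbitrary seed so that the run is repeatable
rf <- randomForest(Species ~ ., data = iris)  # iris is only a stand-in data set

# OOB error after the last tree, as a percentage
oob <- rf$err.rate[rf$ntree, "OOB"] * 100
print(round(oob, 2))

# Confusion matrix: rows are the true classes, columns the OOB predictions;
# the last column holds the per-class error
conf <- rf$confusion
print(conf)

# The largest off-diagonal count marks the most frequently confused pair
counts <- conf[, colnames(conf) != "class.error"]
diag(counts) <- 0
worst <- which(counts == max(counts), arr.ind = TRUE)[1, ]
cat("Most confused:", rownames(counts)[worst["row"]], "->",
    colnames(counts)[worst["col"]], "\n")
```

Fixing the seed is the reason the OOB value stays the same between runs; without it the estimate changes slightly each time, as it did for the figure above.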

importance

This is the importance of the variables in randomForest, as a scatterplot and as printed output:

The variables differ in importance: some are very important, others much less so. The most important are m14, m15, m16, m20 and m21; the least important are m4, m11, m12, m13 and m17. The result of the classification would probably be the same without these least important variables.
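Whether the result really stays the same without the least important variables can be checked by refitting on a reduced data set and comparing the OOB error. This is only a sketch under assumptions: iris stands in for the vegetation data, and dropping exactly one variable is an arbitrary choice made for illustration.

```r
library(randomForest)

set.seed(42)
rf_full <- randomForest(Species ~ ., data = iris)  # iris as a stand-in data set

# Sort the variables by MeanDecreaseGini, most important first
imp <- importance(rf_full)[, "MeanDecreaseGini"]
imp_sorted <- sort(imp, decreasing = TRUE)
print(round(imp_sorted, 2))

# Drop the least important variable (one variable is an arbitrary choice) and refit
drop_var <- names(imp_sorted)[length(imp_sorted)]
reduced <- iris[, setdiff(names(iris), drop_var)]
set.seed(42)
rf_reduced <- randomForest(Species ~ ., data = reduced)

# Compare the final OOB error of both models
oob_full <- rf_full$err.rate[rf_full$ntree, "OOB"]
oob_reduced <- rf_reduced$err.rate[rf_reduced$ntree, "OOB"]
cat("OOB full:", oob_full, " OOB reduced:", oob_reduced, "\n")
```

If the two OOB values are close, the dropped variable contributed little to the classification.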

Importance of variables

 

This is the list of the variable importances.

importance

This is the plot of the randomForest result. I do not know how to interpret it.

Error per tree of randomForest
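One way to approach this plot: for a classification forest, plot() draws the columns of the err.rate matrix stored in the fitted object, that is, the OOB error and the error of each class as a function of the number of trees. The sketch below, again assuming iris as a stand-in data set, prints that matrix and redraws the curves with a legend so that each line can be identified.

```r
library(randomForest)

set.seed(42)
rf <- randomForest(Species ~ ., data = iris)  # iris as a stand-in data set

# One row per tree; column "OOB" plus one column per class
err <- rf$err.rate
print(dim(err))
print(colnames(err))
print(head(err, 3))

# Same curves as plot(rf), but with a legend naming each line
matplot(err, type = "l", lty = 1, col = seq_len(ncol(err)),
        xlab = "Number of trees", ylab = "Error",
        main = "Error per tree of randomForest")
legend("topright", legend = colnames(err), lty = 1,
       col = seq_len(ncol(err)), cex = 0.7)
```

Curves that flatten out towards the right indicate that adding more trees no longer changes the error much.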

MEG-AM L11

I learned the basic structure of classification trees in R: which options there are for building the classes, and how different the resulting decisions can be.

Figures: RandomForestMatrix, ConfusionMatrix, RandForestBänder

The out-of-bag error is 43.9%. My “Acker” class has an error of 0.0; can that be right? “Wald” and “Wiese” also have only a small error. This means that these classes can be distinguished best from the others. For “Wiese”, however, I find this odd, especially with regard to “Streuobstwiese”. The error should probably be read as relative. Nevertheless, Streuobstwiese pixels were classified as Wiese pixels and vice versa…

library(randomForest)

# ctree_control() belongs to the party package and is not a randomForest
# argument; nodesize (minimum size of terminal nodes) is used here as the
# closest counterpart of minbucket = 1
VegDel.rforest <- randomForest(Ntzng_nht ~ ., data = VegDel,
                               nodesize = 1)
plot(VegDel.rforest, main = "RandomForest")
VegDel.import <- round(importance(VegDel.rforest), 2)
# text() with use.n/all is meant for rpart trees, so it is not called on the forest
print(VegDel.rforest)
print(VegDel.import)

plot(VegDel.import, col="red", main="importance", ylab="Importance", xlab="Variable")
# Label each point with its variable name
text(VegDel.import, labels=rownames(VegDel.import), cex=0.75)