
Detecting objects using JavaFX and Deep Learning

Computer vision meets Deep Learning mainly through the use of Convolutional Neural Networks. We have already used them on this blog for labeling images. Another challenging area at the intersection of Deep Learning and computer vision is identifying the position of objects in an image, and for this we have a great neural network architecture called YOLO. The following video shows the power of YOLO networks:




The output of a neural network such as ResNet-50 is a vector of label probabilities. If you feed it an image that contains a cat, it outputs an array of floats, where each position of the array corresponds to a label and the value at that position is the probability of that label. As an example, think of a neural network that should recognize dogs (0) and cats (1): if you feed it an image of a cat, it should output an array of two positions with 0% at position 0 (dog) and 100% at position 1 (cat). The difference with YOLO is that it also outputs a bounding box for each object detected in an image. So if you show it an image with dogs and cats, it will give you the label of each detected object and also its position as a bounding box.
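For instance, reading the winning label from such a classifier with DL4J could look roughly like the sketch below (classifier and catImage are hypothetical names, not code from this project):

// Hypothetical two-class classifier: index 0 = dog, index 1 = cat
INDArray probabilities = classifier.outputSingle(catImage); // e.g. [0.02, 0.98]
int predictedIndex = probabilities.argMax(1).getInt(0);     // 1 -> cat
double confidence = probabilities.getDouble(0, predictedIndex);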

From YOLO web site: https://pjreddie.com/darknet/yolo/

The following presentation from Siraj Raval explains more about YOLO, and there are also Andrew Ng's Coursera classes about object detection.


Ok, how do I use it in Java? As you know, Eclipse DeepLearning4J is one of the best frameworks for deep learning in Java, and I found in their Git repository that they are working on a YOLO model for the model zoo! So we have it already trained, out of the box, ready for our use. This is what we are going to use today.

Note: we can also import Keras models - but I didn't find a good Keras YOLO model that actually worked with the DL4J import API.
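For reference, the import API itself looks roughly like the sketch below (the HDF5 path is just a placeholder):

import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.nn.modelimport.keras.KerasModelImport;

// Imports a Keras model (functional API) saved as an HDF5 file
ComputationGraph kerasYolo = KerasModelImport.importKerasModelAndWeights("/path/to/yolo.h5");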


YOLO With DL4J


This amazing tool called DL4J has a TinyYOLO model (less precise than YOLO, but faster) ready for use. Thanks to Samuel Audet on the DL4J Gitter channel, I found great utility tools that quickly got me the YOLO output information without having to maintain my (at that moment) cumbersome code.

The TinyYOLO model will likely be in DL4J 0.9.2. At the time of this writing it was not yet released, but we can use the SNAPSHOT version!
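The zoo API has changed a bit between DL4J releases, but loading the pretrained model looks roughly like this (builder-style API; adjust to the version you are using):

import org.deeplearning4j.nn.graph.ComputationGraph;
import org.deeplearning4j.zoo.model.TinyYOLO;

// Downloads and caches the pretrained TinyYOLO weights on first use
ComputationGraph yoloModel = (ComputationGraph) TinyYOLO.builder().build().initPretrained();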

The model is used just like any other. The output is also an INDArray, and you can read it manually or use the great utility method from org.deeplearning4j.nn.layers.objdetect.Yolo2OutputLayer:

// Run the model on a single input image
INDArray output = yoloModel.outputSingle(img);
Yolo2OutputLayer outputLayer = (Yolo2OutputLayer) yoloModel.getOutputLayer(0);
// Returns a List<DetectedObject> with every detection above the confidence threshold
List<DetectedObject> predictedObjects = outputLayer.getPredictedObjects(output, threshold);
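For context, the img INDArray fed to outputSingle above is just the image resized to the model input size and scaled to the [0, 1] range; with DataVec that preparation typically looks like the fragment below (the file path is a placeholder):

import org.datavec.image.loader.NativeImageLoader;
import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;

// TinyYOLO expects a 416x416 RGB input; NativeImageLoader takes (height, width, channels)
NativeImageLoader loader = new NativeImageLoader(416, 416, 3);
INDArray img = loader.asMatrix(new File("/path/to/image.jpg"));
new ImagePreProcessingScaler(0, 1).transform(img); // scale pixels from [0, 255] to [0, 1]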

getPredictedObjects returns a list of DetectedObject instances, and each one contains all the information about the position of the detected object. The position is expressed in grid units (YOLO divides the image into an X by Y grid of cells), so we must convert it to pixel coordinates in the image. I took this code again from saudet (and modified it):

for (DetectedObject obj : predictedObjects) {
    String cl = yoloModel.getModelClasses()[obj.getPredictedClass()];
    // Corners of the box in grid units (0..gridW, 0..gridH)
    double[] xy1 = obj.getTopLeftXY();
    double[] xy2 = obj.getBottomRightXY();
    // Convert grid units to pixel coordinates in the image (w x h)
    int x1 = (int) Math.round(w * xy1[0] / gridW);
    int y1 = (int) Math.round(h * xy1[1] / gridH);
    int x2 = (int) Math.round(w * xy2[0] / gridW);
    int y2 = (int) Math.round(h * xy2[1] / gridH);
    int rectW = x2 - x1;
    int rectH = y2 - y1;
    // Draw the bounding box and the class label on the JavaFX canvas
    ctx.setStroke(colors.get(cl));
    ctx.strokeRect(x1, y1, rectW, rectH);
    ctx.strokeText(cl, x1 + (rectW / 2), y1 - 2);
    ctx.setFill(Color.WHITE);
    ctx.fillText(cl, x1 + (rectW / 2), y1 - 2);
}
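The snippet above relies on a few variables from the surrounding application; roughly, they are set up like this (names taken from the loop, values hypothetical):

// TinyYOLO with a 416x416 input produces a 13x13 output grid
int gridW = 13, gridH = 13;
// Size of the image as drawn on the JavaFX canvas
double w = canvas.getWidth(), h = canvas.getHeight();
GraphicsContext ctx = canvas.getGraphicsContext2D();
// One stroke color per class label
Map<String, Color> colors = new HashMap<>();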

This is our simple application that allows you to play with the TinyYOLO model or any other YOLO model, which means you can train your own YOLO model and use the app to analyse any image you want. Make sure to set the following system properties (see the example after this list), otherwise the default TinyYOLO will be used:

model.path: the full path to the model file on your disk
model.classes: the comma-separated classes, for example: person,bike,car
model.input.info: the input info in the format width,height,channels, for example: 416,416,3
model.grid: the grid size in the format w,h, for example: 13,13
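Since these are plain JVM system properties, they are passed with -D flags when launching the application (for example -Dmodel.path=/home/me/my-yolo.zip) and read inside the app with System.getProperty, roughly like this sketch (not necessarily the app's exact code):

// Fall back to the bundled TinyYOLO defaults when a property is not set
String modelPath   = System.getProperty("model.path");                              // null -> use TinyYOLO
String[] classes   = System.getProperty("model.classes", "").split(",");
String[] inputInfo = System.getProperty("model.input.info", "416,416,3").split(",");
String[] grid      = System.getProperty("model.grid", "13,13").split(",");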

Our application also allows us to set the threshold used to discard objects whose confidence is too low. With a low threshold we get many boxes; with a high threshold many predictions are discarded:

Adjusted the threshold and it detected a few objects with precision

Smallest threshold and what we have is a mess!

High threshold and we lose some detections

The application allows zooming and scrolling using code I found in a JavaFX forum. I just made a few changes and it works really well!

Zoom in the image

This is just a small PoC with the YOLO model, and there is room for improvement. I was initially planning to bring a video with object detection in real time, but the model is too slow on my machine (about 1 frame per second). If I get access to a machine with a GPU, I will run the test and post it to this blog.

In a new version we could remove redundant boxes!
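Removing redundant boxes is typically done with non-max suppression; a minimal sketch over DL4J's DetectedObject could look like the code below (an illustration only, not the application's code):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import org.deeplearning4j.nn.layers.objdetect.DetectedObject;

// Keep the highest-confidence box and drop any other box of the same class
// that overlaps it (IoU) more than the given threshold
static List<DetectedObject> nonMaxSuppression(List<DetectedObject> objs, double iouThreshold) {
    List<DetectedObject> sorted = new ArrayList<>(objs);
    sorted.sort(Comparator.comparingDouble(DetectedObject::getConfidence).reversed());
    List<DetectedObject> kept = new ArrayList<>();
    for (DetectedObject candidate : sorted) {
        boolean redundant = false;
        for (DetectedObject keeper : kept) {
            if (keeper.getPredictedClass() == candidate.getPredictedClass()
                    && iou(keeper, candidate) > iouThreshold) {
                redundant = true;
                break;
            }
        }
        if (!redundant) kept.add(candidate);
    }
    return kept;
}

// Intersection over union of two boxes; coordinates are in grid units,
// which is fine because IoU is a ratio
static double iou(DetectedObject a, DetectedObject b) {
    double ax1 = a.getTopLeftXY()[0], ay1 = a.getTopLeftXY()[1];
    double ax2 = a.getBottomRightXY()[0], ay2 = a.getBottomRightXY()[1];
    double bx1 = b.getTopLeftXY()[0], by1 = b.getTopLeftXY()[1];
    double bx2 = b.getBottomRightXY()[0], by2 = b.getBottomRightXY()[1];
    double interW = Math.max(0, Math.min(ax2, bx2) - Math.max(ax1, bx1));
    double interH = Math.max(0, Math.min(ay2, by2) - Math.max(ay1, by1));
    double inter = interW * interH;
    double union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter;
    return union <= 0 ? 0 : inter / union;
}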

Finally, this is it, guys. The DL4J developers are doing an amazing job bringing deep learning to Java. I think DL4J will be one of the most important libraries for Java once the market starts to understand how deep learning will change their business!


Find the application code on my GitHub.



