Over the past few weeks, I have done a lot with YOLOv5. A few weeks before this article, I wrote one on why I love YOLOv5, and later I did a project with YOLOv5 that was an attempt at making something like Symbolab or similar software.
I explained that project in detail on my Persian blog (link), and I may write an English article about it soon. In this article, though, I am going to explain a newly finished project of mine!
Introduction
After making the math equation OCR, I got a few ideas about doing similar projects in other areas that interest me. Believe it or not, I am not really the type of person who sticks to only one thing, and I have tried many different things in my life. As my job is making computer software and platforms, I decided to use the knowledge I have in this field to improve my performance in other fields as well.
I studied Computer Hardware Engineering at university and I know a thing or two about electronics. I have never been an electrician or an electronics expert, but I have made some cool gear using Arduino, Raspberry Pi and even basic electronic components. I am also a big fan of the YouTuber ElectroBOOM and like what he does a lot!
So this is the reason I started this project. I decided to make a computer vision program that helps us understand the components in a schematic, and in this article I will explain how I did it.
Who’s the audience of this article?
Since I am not the type of content creator or writer who bombards the audience with complex math and physics (or computer science) concepts, I have to say: everyone.
To be more specific, everyone who is enthusiastic about artificial intelligence, computer vision and electronics, and who can read English, is my audience, at least for this particular article. Also, if you are a newbie who wants to find your own path in the vast universe of computer science, this article will give you an idea of what computer vision projects combined with deep learning look like.
Previous work
Although I didn't want this article to be a thesis or research paper, I had to put this section in. Honestly, I haven't searched for what people may have done with YOLOv5 (or other tools) to analyze electric or electronic circuitry.
I'm sure there are other minds out there who have thought of this, and I appreciate their thoughts and their efforts.
The research procedure
The problem
We have tons of circuit schematics in books or notes which students or enthusiasts can't understand very well. Unlike with math or physics formulas, I don't know of an application or tool that tells you which symbol in a schematic represents which component, so we need some tool to understand our circuits better.
The possible ways of implementation
- Using OpenCV functions such as contour detection and similar techniques to detect which shape is which.
- Using a pre-trained model for electrical components.
- Developing a CNN or similar network to detect the components.
- Fine-tuning YOLOv5 for our needs.
Each of these ways had its own problems. In the following lines, we'll find out why most of them were inefficient for me.
Using OpenCV functions is the first go-to for most computer vision programmers, but it becomes really problematic when you get symbols that look very similar to each other. This is an example of my input data:
As you can see, I have a battery in series with a capacitor, and even to human eyes these two can be mistaken for each other! And remember, OpenCV doesn't do magic; it is only a great tool for processing images.
The next way was to find a pre-trained deep learning model that already knows the components. It is a nice idea, but it has its own problems. For example, I would have no idea which network was used, which libraries were used, and so on. Also, there is no MediaPipe for electric circuitry, something whose functionality you can rely on in your projects.
The third way was my second favorite by far: developing our own CNN or a similar network for object detection or localization. It is cool and it can be efficient, but the amount of work I would have to put into it was beyond my tolerance. Especially since I'm not doing these projects for graduation or money, I did not want to put too much effort into it.
And last but not least, fine-tuning YOLOv5 for my needs was the best solution I could think of. YOLOv5 is one of the best tools for quickly implementing your computer vision plus deep learning ideas. It is also a very accurate and fast tool. So I went with this one.
Data gathering and preparation
YOLOv5 requires a set of labeled images. That means we need images of our topic of interest, plus a label file for each image, and nothing more.
Nicholas Renotte explains how to get data or images you need in this video. So if you want to do a similar project, I suggest giving that video a watch. But in my case, things were a little different.
I needed tons of schematics, but on the other hand, I didn't really want to spend a very long time labeling and preparing the data. So I decided to draw a couple of schematics on a piece of A4 paper like this:
For preparation, I just took photos of these drawings using my phone (Xiaomi Redmi Note 8 Pro) and then moved them to my computer.
To slice them into small chunks, I just used Adobe Photoshop (I know that might be surprising, but I am too lazy to use any other tool) and then saved them into a folder structure acceptable for YOLOv5.
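As a rough illustration of what such a folder structure can look like, here is a minimal sketch. It assumes the common YOLOv5 convention of parallel images/ and labels/ folders with train and val splits, described by a small data.yaml file. The class names and the folder name are my assumptions for illustration, not the exact ones used in this project.

```python
import tempfile
from pathlib import Path
from typing import List

# Hypothetical class names; the actual classes of the project may differ.
CLASS_NAMES = ["resistor", "capacitor", "inductor", "battery"]


def make_dataset_skeleton(root: Path, class_names: List[str]) -> Path:
    """Create images/ and labels/ folders for train and val, plus a data.yaml."""
    for kind in ("images", "labels"):
        for split in ("train", "val"):
            (root / kind / split).mkdir(parents=True, exist_ok=True)

    # Dataset description file that YOLOv5's training script reads via --data.
    lines = [
        f"path: {root.as_posix()}",
        "train: images/train",
        "val: images/val",
        f"nc: {len(class_names)}",
        "names: [" + ", ".join(class_names) + "]",
    ]
    yaml_path = root / "data.yaml"
    yaml_path.write_text("\n".join(lines) + "\n")
    return yaml_path


if __name__ == "__main__":
    # Demo in a throwaway temporary directory.
    demo_root = Path(tempfile.mkdtemp()) / "circuits"
    print(make_dataset_skeleton(demo_root, CLASS_NAMES).read_text())
```

The sliced photos would then go into images/train and images/val, and each label file goes into the matching labels/ folder under the same file name with a .txt extension.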
The next part (which I always call the worst part of an A.I/Data project) was cleaning up the data and then labeling it. I used labelImg to label my images, since it can save labels in the YOLO format.
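If you are curious what a YOLO label actually looks like, here is a small sketch. Each image gets a text file with one line per object: the class index, then the box center and size, all normalized by the image width and height. The function name and the example box are made up for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space bounding box to one line of a YOLO label file.

    YOLO format: class x_center y_center width height,
    with the four box values normalized to the 0..1 range.
    """
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A made-up box for class 0 on a 416x416 image.
    print(to_yolo_line(0, 104, 52, 312, 156, 416, 416))
```

A labeling tool writes these lines for you, so this is only meant to show what ends up in the label files.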
Training YOLOv5
After doing all the hard stuff, the time to train our beloved YOLOv5 arrived. Training YOLOv5 is fairly easy! You just have to follow the guide provided in their GitHub repository to train your own version of YOLOv5.
Since the process of training YOLOv5 is easy and well-documented, I won't spend much time explaining it here. I will only point out what I did in order to get the best results.
I used an image size of 416×416 (if you're not familiar with YOLOv5, you should know that their training script resizes the images) and a batch size of 32.
At the beginning, I used their base weights (which are trained on the COCO dataset) called yolov5s, which stands for Small YOLOv5. It has 7.2 million parameters (according to this table), and it wasn't really good after almost 200 epochs. So I restarted my training process with yolov5m, which stands for Medium YOLOv5 and has 21.2 million parameters.
To be honest, I know the number of parameters isn’t the only thing that matters, but for the love of God, let’s keep things simple.
Finally, with 416×416 images, a batch size of 32, 500 epochs, the medium model and almost five hours of waiting (since I was doing this on my MacBook Pro and not in Google Colab), I got my desired results.
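For reference, those settings map onto the training script of the YOLOv5 repository roughly like this. This is only a sketch: the data.yaml path is a hypothetical placeholder, and it assumes you run the command from inside a clone of the YOLOv5 repository.

```python
import shlex


def build_train_command(data_yaml, weights="yolov5m.pt",
                        img_size=416, batch_size=32, epochs=500):
    """Build the argument list for YOLOv5's train.py with the settings above."""
    return [
        "python", "train.py",
        "--img", str(img_size),      # images are resized to this size
        "--batch", str(batch_size),
        "--epochs", str(epochs),
        "--data", data_yaml,         # dataset description file
        "--weights", weights,        # pre-trained starting weights
    ]


if __name__ == "__main__":
    # "circuits/data.yaml" is a placeholder path, not the project's real one.
    command = build_train_command("circuits/data.yaml")
    print(shlex.join(command))
    # To actually start training, something like this would be run
    # from the root of the cloned repository:
    # subprocess.run(command, check=True)
```

Swapping the weights argument to yolov5s.pt gives the small-model run I tried first.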
The result
As you can see, I got pretty good confidence levels on my components. Unfortunately, the confidence levels for the inductors don't fit in the picture, so for a better understanding of this resulting photo, I put this table here as well:
Future work
After finishing this project, I got a few ideas in my head. The very first is to generate a netlist for SPICE software. Imagine drawing a circuit on paper (most of us engineers usually use paper for our initial designs, right?), then taking a photo of it and boom! You have it in your SPICE software.
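To make the idea a bit more concrete, here is a sketch of how a list of detected components could be turned into netlist text. It is not something I have built yet. It assumes the node names come from a separate step that traces the wires, which the detector does not do, and the class names and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Component:
    kind: str    # detected class, e.g. "resistor"
    node_a: str  # assumed to come from a separate wire-tracing step
    node_b: str
    value: str   # e.g. "200k"


# SPICE element names start with a letter that identifies the element type.
SPICE_PREFIX = {"resistor": "R", "capacitor": "C", "inductor": "L", "battery": "V"}


def to_netlist(title, components):
    """Return netlist text: title line, one line per component, then .end."""
    counters = {}
    lines = [title]
    for comp in components:
        prefix = SPICE_PREFIX[comp.kind]
        counters[prefix] = counters.get(prefix, 0) + 1
        lines.append(
            f"{prefix}{counters[prefix]} {comp.node_a} {comp.node_b} {comp.value}"
        )
    lines.append(".end")
    return "\n".join(lines)


if __name__ == "__main__":
    circuit = [
        Component("battery", "1", "0", "9"),
        Component("resistor", "1", "2", "200k"),
        Component("capacitor", "2", "0", "10u"),
    ]
    print(to_netlist("RC example", circuit))
```

The hard part is finding out which component connects to which node; writing the text file is the easy bit.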
The second thing that comes to my mind is combining this with OCR software that can understand the numbers and units we use in our electrical circuitry. For example, it would understand that 200K beside a resistor means the resistor has 200 kilo ohms of electrical resistance.
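Once the OCR gives us a string like 200K, turning it into a number is the simple part. Here is a small sketch of that step. The suffix table is my assumption about how values are usually written on hand-drawn schematics (for example, a capital M meaning mega), not a standard.

```python
import re

# Assumed suffix convention for hand-written schematic values.
SI_MULTIPLIERS = {
    "p": 1e-12, "n": 1e-9, "u": 1e-6, "m": 1e-3,
    "": 1.0, "k": 1e3, "K": 1e3, "M": 1e6, "G": 1e9,
}

# A number (optionally with a decimal part) followed by an optional suffix.
VALUE_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([pnumkKMG]?)\s*$")


def parse_value(text):
    """Turn a string such as '200K' into a float in base units (200000.0)."""
    match = VALUE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"cannot parse component value: {text!r}")
    number, suffix = match.groups()
    return float(number) * SI_MULTIPLIERS[suffix]


if __name__ == "__main__":
    print(parse_value("200K"))
```

Whether the result is in ohms, farads or henries would depend on which component the text sits next to.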
Then we can feed all this data to a calculator that helps us understand our designs better and gives us information about the behavior of our circuit in different situations, such as changes in current, voltage or frequency.
Conclusion
In conclusion, I believe every kind of OCR can be helpful in our lives. I remember that when I was a child, there was some sort of pen-like device that could read verses of the Quran, and I liked the whole idea.
Later, when I got older, I decided to find out how that magical pen works and whether we could improve it. Yes, the Quran is very important for Muslim people and there is no doubt about that, but in my opinion that wasn't enough, since the device could also be used by visually impaired people. They could use that pen to understand the Quran and other types of texts as well.
And now I have the knowledge to make the world a better place by using technology to people's advantage. After making a real-time sign language translation program with A.I, I have decided to conquer other realms of computer vision as well.
Lastly, I have to say there is a very vast world of the unknown that we can uncover using our knowledge, and I try my best to do that.
Regards.