Let’s build the Metaverse with AI: Introduction

It was 2021 when all the products under the Facebook flag went down for a few hours. I remember that most of my friends started messaging me on Telegram instead of WhatsApp, and no new posts or stories were uploaded to Instagram.

A few hours passed and everything went back to normal, except for one thing. Zuckerberg made a huge announcement and told the whole world that Facebook would be known as Meta. He also announced the Metaverse, a weird alternate-life game where you can pay actual money and get a whole lot of nothing.

I personally liked the idea of the metaverse (at the time, I was a co-founder of ARMo, an augmented reality startup), so as you may guess, it was basically my job to follow the trends and news about the metaverse and what happened around it.

For a few days now, I have been thinking about the metaverse again, because I strongly believe the whole thing is going to become hyped again, especially with this bull run on Bitcoin and other cryptocurrencies. I also concluded that the metaverse has a big missing link, which I’m going to discuss in this post.

A little backstory

Since I started Mann-E, an AI image generation platform, a lot of people have messaged me about connecting the whole thing to the blockchain. Recently, I moved the whole payment system to cryptocurrencies and I’m happy with what I’ve done, not gonna lie.

But for being on the chain, I had different thoughts in mind, and one of them was an ICO, or even an NFT collection. They may seem cool, but they also always attract a fair amount of criticism and skepticism. Of course, I don’t want to be seen as a bad guy in my community, so I dropped those ideas for good.

As you read earlier in this post, I have a history in the XR (extended reality) business and I currently run my own AI company. I was thinking about the connection between the metaverse and AI, and the opportunities in both!

Before going deep, I have to ask a question…

What did we need to access the metaverse?

In 2021, when it was the hot topic of every tech forum, if you asked “Okay then, how can I enter the metaverse?”, no one could answer correctly. At least in the Iranian scene, it was like this.

I did a lot of research and found that you needed these to enter a metaverse of choice:

  • A crypto wallet: Which is not a big deal. Pretty much everyone who’s familiar with tech and these new trends owns a crypto wallet. They’re everywhere. You can have them as web apps, native apps, browser extensions and even in hardware form. If you want to waste a few hours of your life, you can also build one from scratch.
  • Internet browser: Are you kidding me? We all have one. Most of the applications we used to install on our computers have turned into SaaS platforms, so we need a good browser anyway.
  • A bit of crypto: In my opinion, the problem starts here. Most of these projects had a token built on the Ethereum network (or accepted ETH directly), but some of them had their own native currencies which were impossible to buy on well-known exchanges and, as you guessed, that increased the chance of a scam! In general, it was a little odd to be forced to pay to enter the verse without knowing what was happening there. Here is an example. Imagine you are in Dubai and you see a luxurious shopping center. You have to pay $100 just to enter, you do some window-shopping, and you leave disappointed. It’s just a loss, isn’t it?

But this is not all of it. A person like me who considers themselves a builder needs to explore the builder opportunities as well, right? Now I have a better question and that is…

What do we need to build on the metaverse?

In addition to a wallet, a browser and initial funds for entering the metaverse, you also need something else. You need metaverse development skills, which are not easy to acquire.

If we talk about the programming side of things, most of the work can be done with libraries such as ThreeJS or similar ones. If you have a development background and access to resources such as ChatGPT, mastering the new library should not take more than a week.

But something else occupied my mind: 3D design skills, which are not easy for everyone to acquire and may take years to master.

And this is why I think the metaverse needs AI. I will explain in the next section.

The role of AI in metaverse

This is my favorite topic. I have been using AI since 2021 in different ways. For example, I explained how I could analyze electrical circuits using AI. Also, if you dig deeper into my blog, you may find that I even explained my love of YOLOv5 models.

But my first serious generative AI project came when GitHub’s Copilot became a paid product and I was too cheap to pay for it, so I built my own. In that particular project, I used a large language model called BLOOM to generate code for me. It was the beginning of my journey in generative artificial intelligence.

A few months after that, I discovered AI image generators. That led me to the point where I could start my own startup with just ten dollars of funding. Now, I have bigger steps in mind.

Generative AI for metaverse

There is a good question here: how can generative artificial intelligence be useful in the metaverse? I have a list of opportunities:

  • Tradebots: Since most metaverse projects offer their own coin or token, we may be able to use AI to give us some sort of advice or prediction. Honestly, this is my least favorite function of AI in the metaverse. I was never a big fan of fintech and similar stuff.
  • Agents: Of course when we’re entering the matrix, sorry, I meant metaverse, we need agents helping us find a good life there. But jokes aside, agents can help us in different ways, such as building, finding resources or learning how to interact with the surrounding universe.
  • Generating the metaverse: Honestly, this is my favorite topic of all time. We may be able to use different models to generate the different assets we need to build our metaverse. For this one, we need different kinds of models: not only LLMs, but also image generators, sound generators, etc.

What’s next?

The next step is a study of every resource or model which can be functional or useful in this space. We may also need to explore the possibilities of different blockchains and metaverses in general. But first, the focus must be on AI models. The rest will be made automatically 😁

Analyzing components of an electric circuit with YOLOv5

In recent weeks, I did a lot with YOLOv5. A few weeks prior to this article, I wrote an article on why I love YOLOv5, and later I did a project with YOLOv5 which was an attempt at making something like Symbolab or similar software.

I explained that project in detail on my Persian blog (link) and I may write an English article about it soon. But in this article specifically, I am going to explain a newly finished project of mine!

Electric Circuit component analysis using YOLOv5

Introduction

After making the math equation OCR, I got a few ideas about doing similar projects in different areas of my interest. Believe it or not, I am not really the type of person who sticks to only one thing, and I have tried many different things in my life. As my job is making computer software and platforms, I decided to use the knowledge I have in this field to improve my performance in other fields as well.

I studied Computer Hardware Engineering at university and I know a thing or two about electronics. I have never been an electrician or an electronics expert, but I have made some cool gear using Arduino, Raspberry Pi and even basic electronic components. I am also a big fan of the YouTuber ElectroBOOM and like what he does a lot!

So this is the reason I started this project. I decided to make a computer vision program which helps us understand the components in a schematic, and in this article, I will explain how I did it.

Who’s the audience of this article?

Since I am not the type of content creator or writer who bombards the audience with complex math and physics (or computer science) concepts, I have to say everyone.

To be more specific, everyone who’s enthusiastic about artificial intelligence, computer vision and electronics and is able to read English is my audience, at least for this particular article. Also, if you are a newbie who wants to find their own path in the vast universe of computer science, this article will give you an idea about computer vision projects combined with deep learning.

Previously done works

Although I didn’t want this article to be a thesis/research paper, I had to put this in the article. Honestly, I haven’t searched for what people may have done with YOLOv5 (or other tools) to analyze electric or electronic circuitry.

I’m sure there are other minds out there who have thought of this, and I appreciate their thoughts and also their efforts.

The research procedure

The problem

We have tons of circuit schematics in books or notes which students or enthusiasts can’t understand very well. Unlike math or physics formulas, there is no application or tool that tells you which symbol represents which component, so we need some tool to understand our circuits better.

The possible ways of implementation

  1. Using OpenCV functions such as contour detection and similar techniques to detect which shape is which.
  2. Using a pre-trained model for electrical components.
  3. Developing a CNN or similar network to detect the components.
  4. Fine-tuning YOLOv5 to our need.

Each of these ways had its own problems. In the following lines, we’ll find out why most of them were inefficient for me.

Using OpenCV functions is the first go-to for most computer vision programmers, but it is really problematic, especially when you get symbols which look very similar to each other. This is an example of my input data:

Example of input data

As you can see, I have a battery in series with a capacitor, and even to human eyes, these two can be mistaken for each other! And remember, OpenCV doesn’t do magic; it is only a great tool for processing images.

The next way was to find a pre-trained deep learning model which has the data of the components. It is a nice idea, but it also has its own problems. For example, I would have no idea which network was used, which libraries were used, etc. Also, there is no MediaPipe for electric circuitry, where you can be sure about its functionality in your projects.

The third way was my second favorite by far: developing our own CNN or similar network for object detection or localization. It is cool and it can be efficient, but the amount of work I had to put into it was beyond my tolerance. Especially since I’m not doing these projects for graduation or money, I did not want to put too much effort into my project.

And last but not least, fine-tuning YOLOv5 for my needs was the best solution I could think of. YOLOv5 is one of the best tools for quickly implementing your computer vision plus deep learning ideas. It is also a very accurate and fast tool. So I went with this one.

Data gathering and preparation

YOLOv5 requires a set of labeled images. It means we need images of our topic of interest, each labeled with the objects it contains, and nothing more.

Nicholas Renotte explains how to get data or images you need in this video. So if you want to do a similar project, I suggest giving that video a watch. But in my case, things were a little different.

I needed tons of schematics and, on the other hand, I didn’t really want to spend a very long time labeling and preparing the data. So I decided to draw a couple of schematics on a piece of A4 paper like this:

Example of my data

and for preparation, I just took photos of these drawings using my phone (Xiaomi Redmi Note 8 Pro) and then moved them to my computer.

To slice them into small chunks, I just used Adobe Photoshop (I know that might be surprising, but I am too lazy to use any other tool) and then saved them into a folder structure acceptable for YOLOv5.
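If you wonder what such a folder structure can look like, here is a minimal sketch in Python. The root folder name, the train/val split and the class names are my assumptions for illustration, not necessarily the exact ones I used; the layout itself (images and labels side by side, plus a small dataset YAML) follows the YOLOv5 custom-data guide.

```python
from pathlib import Path


def create_yolov5_layout(root, class_names):
    """Create images/ and labels/ folders for train and val, plus a dataset YAML."""
    root = Path(root)
    for kind in ("images", "labels"):
        for split in ("train", "val"):
            (root / kind / split).mkdir(parents=True, exist_ok=True)

    # YOLOv5 reads the dataset description from a small YAML file.
    lines = [
        f"path: {root.as_posix()}",
        "train: images/train",
        "val: images/val",
        f"nc: {len(class_names)}",
        "names: [" + ", ".join(f"'{name}'" for name in class_names) + "]",
    ]
    yaml_path = root / "data.yaml"
    yaml_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return yaml_path
```

Calling something like create_yolov5_layout("circuit_dataset", ["resistor", "capacitor", "inductor", "battery"]) (hypothetical names) gives you empty folders to drop the sliced photos and their label files into.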

The next part (which I always call the worst part of an AI/data project) was cleaning up the data and then labeling it. I used labelImg to label my images since it provides a YOLO labeling format.
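To give you an idea of what that labeling format looks like: each image gets a text file with one line per object, holding the class index and the box center, width and height, all normalized to the image size. The helper below is a small sketch of that conversion; the function name and the example numbers are mine, made up for illustration.

```python
def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (xmin, ymin, xmax, ymax) into one YOLO label line."""
    xmin, ymin, xmax, ymax = box
    img_w, img_h = image_size
    # YOLO stores the box center and size as fractions of the image size.
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A hypothetical box on a 416x416 image, labeled as class 0.
    print(to_yolo_line(0, (104, 52, 312, 156), (416, 416)))
```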

Training YOLOv5

After doing all the hard stuff, the time to train our beloved YOLOv5 arrived. Training YOLOv5 is fairly easy! You just have to follow the guide provided in their GitHub repository to train your own version of YOLOv5.

Since the process of training YOLOv5 is easy and well-documented, I won’t spend much time explaining it here. I will only point out what I did in order to get the best results.

I used an image size of 416×416 (if you’re not familiar with YOLOv5, you should know that their training script resizes the images) and a batch size of 32.
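If you are curious what resizing a phone photo to a square input involves, one common approach is to scale the image so its longer side matches the target and pad the rest (often called letterboxing). The sketch below only computes the numbers; it is my own illustration of the idea, not YOLOv5’s actual code, and the example photo size is an assumption.

```python
def letterbox_shape(width, height, target=416):
    """Return the scaled size and the padding needed to fill a target square."""
    # Use the smaller ratio so the whole image fits inside the square.
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x, pad_y = target - new_w, target - new_h
    return (new_w, new_h), (pad_x, pad_y)


if __name__ == "__main__":
    # A hypothetical 1280x960 photo scaled for a 416x416 input.
    print(letterbox_shape(1280, 960))
```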

At the beginning, I used their base weights (trained on the COCO dataset) called yolov5s, which stands for Small YOLOv5. It has 7.2 million parameters (according to this table) and it wasn’t really good after almost 200 epochs. So I restarted my training process with yolov5m, which stands for Medium YOLOv5 and has 21.2 million parameters.

To be honest, I know the number of parameters isn’t the only thing that matters, but for the love of God, let’s keep things simple.

Finally, with 416×416 images, a batch size of 32, 500 epochs, the medium model and almost five hours of waiting (since I was doing this on my MacBook Pro and not in Google Colab), I got my desired results.
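For the record, those settings map onto the flags of the train.py script in the YOLOv5 repository (--img, --batch, --epochs, --data and --weights). The small sketch below just assembles such a command; the dataset file name is an assumption, and you would run the result from inside a clone of the repository.

```python
import shlex


def build_train_command(data_yaml, weights="yolov5m.pt", img_size=416, batch=32, epochs=500):
    """Assemble the argument list for YOLOv5's train.py with the settings above."""
    return [
        "python", "train.py",
        "--img", str(img_size),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data_yaml,      # hypothetical path to the dataset YAML
        "--weights", weights,     # pretrained checkpoint to fine-tune from
    ]


if __name__ == "__main__":
    print(shlex.join(build_train_command("circuit_dataset/data.yaml")))
```

Swapping weights to "yolov5s.pt" gives the first attempt with the small model.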

The result

The final result

As you can see, I got pretty good confidence levels on my components. Unfortunately, the confidence levels for the inductors don’t fit in the picture, so for a better understanding of the resulting photo, I put this table here as well:

Confidence levels and coordinates

Future works

After finishing this project, I got a few ideas. The very first is to generate a netlist for SPICE software. Imagine you could draw a circuit on paper (most of us engineers usually use paper for our initial designs, right?), take a photo of it and boom! You have it in your SPICE software.
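To make the idea a bit more concrete, here is a rough sketch of the last step only: turning a list of components into netlist lines. Detection alone does not tell us how the components are wired, so the node names below are assumed to be known already; the Component class and the example values are hypothetical.

```python
from dataclasses import dataclass

# SPICE element names start with a letter that identifies the component type.
PREFIXES = {"resistor": "R", "capacitor": "C", "inductor": "L", "battery": "V"}


@dataclass
class Component:
    kind: str      # detected class name, e.g. "resistor"
    node_a: str    # first node the component is connected to
    node_b: str    # second node ("0" is ground in SPICE)
    value: str     # value as text, e.g. "200k"


def to_netlist(title, components):
    """Build a minimal SPICE netlist: title line, one line per element, .end."""
    counters = {}
    lines = [f"* {title}"]
    for comp in components:
        prefix = PREFIXES[comp.kind]
        counters[prefix] = counters.get(prefix, 0) + 1
        lines.append(f"{prefix}{counters[prefix]} {comp.node_a} {comp.node_b} {comp.value}")
    lines.append(".end")
    return "\n".join(lines)


if __name__ == "__main__":
    circuit = [
        Component("battery", "1", "0", "9"),
        Component("resistor", "1", "2", "200k"),
        Component("capacitor", "2", "0", "10u"),
    ]
    print(to_netlist("hand-drawn RC circuit", circuit))
```

The hard part, working out which component touches which wire, is still open and is not covered by this sketch.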

The second thing that comes to mind is combining this with OCR software which can understand the numbers and units we use in our electrical circuitry. For example, it would understand that 200K beside a resistor means the resistor has 200 kilohms of electrical resistance.
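Once the OCR hands us a string like 200K, turning it into a number is the easy part. Below is a small sketch of that step; the set of suffixes it accepts (R, k/K and M) is my own assumption about how values are usually written next to resistors.

```python
import re

# Suffixes commonly written next to resistors (assumed set, not exhaustive).
MULTIPLIERS = {"": 1.0, "R": 1.0, "k": 1e3, "K": 1e3, "M": 1e6}
VALUE_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([kKMR]?)\s*$")


def parse_resistance(text):
    """Turn OCR text such as '200K' into a resistance in ohms."""
    match = VALUE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"cannot read a resistance from {text!r}")
    number, suffix = match.groups()
    return float(number) * MULTIPLIERS[suffix]


if __name__ == "__main__":
    print(parse_resistance("200K"))
```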

Then, we can feed all this data into a calculator which helps us better understand our designs and gives us information about the behavior of our circuit in different situations, such as changes in current, voltage or frequency.
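As a tiny taste of what such a calculator could do, the sketch below looks at a resistor in series with a capacitor and shows how the impedance changes with frequency, using the textbook formulas X_C = 1/(2πfC) and |Z| = √(R² + X_C²). The component values are made up for the example.

```python
import math


def capacitive_reactance(frequency_hz, capacitance_f):
    """Reactance of a capacitor: X_C = 1 / (2 * pi * f * C)."""
    return 1.0 / (2.0 * math.pi * frequency_hz * capacitance_f)


def series_rc_impedance(resistance_ohm, capacitance_f, frequency_hz):
    """Magnitude of the impedance of a resistor in series with a capacitor."""
    return math.hypot(resistance_ohm, capacitive_reactance(frequency_hz, capacitance_f))


def cutoff_frequency(resistance_ohm, capacitance_f):
    """Frequency at which the reactance equals the resistance."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


if __name__ == "__main__":
    r, c = 200e3, 10e-9  # hypothetical 200 kilohm resistor and 10 nF capacitor
    for f in (10, 100, 1000):
        print(f, "Hz ->", round(series_rc_impedance(r, c, f)), "ohm")
    print("cutoff:", round(cutoff_frequency(r, c), 1), "Hz")
```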

Conclusion

In conclusion, I believe every kind of OCR can be helpful in our lives. I remember when I was a child there was some sort of pen-like device which could read verses of the Quran, and I liked the whole idea.

Later, when I got older, I decided to find out how that magical pen works and whether we can improve it. Yes, the Quran is very important for Muslim people and there is no doubt about that, but that wasn’t enough in my opinion, since that device could be used by visually impaired people. They could use that pen to understand the Quran and other types of texts as well.

And now, I have the knowledge to make the world a better place by using technology to people’s advantage. After making a real-time sign language translation program with AI, I have decided to conquer other realms of computer vision as well.

Lastly I have to say there is a very vast world of the unknown we can easily uncover using our knowledge and I try my best to do that.

Regards.

Why do I love YOLOv5?

I am a big fan of Nicholas Renotte’s channel on YouTube. I also love computer vision and its combination with deep learning. A few months ago, Nicholas posted this video, which is about YOLOv5. I am usually too lazy to watch videos which are longer than 15 minutes, so I watch them in a few episodes. But this video made me sit in front of the laptop screen for over an hour, and I don’t regret it.

So let’s start the article and see where this story begins. As I mentioned earlier, I love computer vision, especially when it’s combined with deep learning. I believe it can help us solve very complex problems in our projects with ease. My journey in the world of YOLO models started almost a year ago, when I wanted to develop a simple object detector for street signs.

At first, I found a lot of tutorials on darknet-based training, but I did not manage to get it to work; especially since I have a Mac, it could be a very realistic nightmare. So I guess YOLOv5 was a miracle. In this article, I am going to explain why I love YOLOv5 and why I prefer it to other YOLO versions.

What is YOLOv5?

According to their GitHub repository, YOLOv5 is a family of deep learning models trained on Microsoft’s COCO dataset. This makes it a very general-purpose object detection tool which is fine for basic research and fun projects.

But I also needed my own models because I wanted to develop some domain-specific object detection software. Then I realized they also provide a Python script which helps you fine-tune and train your own version of YOLOv5.

So I basically fell in love with this new thing I had discovered. In the next sections, I will explain why I love YOLOv5!

Why do I love YOLOv5?

Firstly, I invite you to see this chart, which shows the comparison of YOLOv5 with other commonly used object detection models:

And since there has been some controversy about YOLOv5’s claims about training time, inference time, model storage size, etc., I highly recommend you read this article on Roboflow’s blog.

So the very first thing which made me happy is the speed. The second thing, by the way, is the fact that I am lazy. Yes, I am lazy and I know it.

I kept trying to compile darknet so I could have a YOLOv4 model and build my projects on top of it, but when I saw how hard it could get, and since I have a Mac and didn’t really want to fire up an old computer for these projects, I started looking for something which does everything with a bunch of Python scripts.

After I discovered YOLOv5, I started working with it, and the very first project I did was this pedestrian detection for a self-driving car.

Then, I started doing a lot of research and asking about what I could do with YOLOv5. I found out I can do pretty much anything I want with ease, as they provide a lot of stuff themselves. Isn’t that good enough? Fine. Let me show you another YouTube video of mine, in which I solved my cropping problem with their built-in functions.

If you’re not convinced yet, I have to tell you there is a great method called pandas in this family of models.

As the name tells us, it outputs a pandas DataFrame whose data you can easily use. Let me give you a better example. Say we want to find out which plants are afflicted and which ones are not in drone footage.

By using this method, we can simply write an algorithm which counts the number of afflicted plants in a single frame, so we can easily find out how many afflicted plants we have in a certain area. The whole point here is that we get statistically sound data for most of our research.
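Here is a minimal sketch of that counting step. With a model loaded through torch.hub, results.pandas().xyxy[0] gives a DataFrame with columns such as confidence and name, and calling to_dict("records") on it gives a list of rows like the ones below. To keep the example runnable without a trained model, I feed it hand-written rows; the class names afflicted and healthy and the threshold are assumptions.

```python
from collections import Counter


def count_by_label(rows, min_confidence=0.5):
    """Count detections per class name, ignoring low-confidence ones."""
    counts = Counter()
    for row in rows:
        if row["confidence"] >= min_confidence:
            counts[row["name"]] += 1
    return counts


if __name__ == "__main__":
    # Hand-written stand-in for results.pandas().xyxy[0].to_dict("records").
    frame_rows = [
        {"name": "afflicted", "confidence": 0.91},
        {"name": "healthy", "confidence": 0.84},
        {"name": "afflicted", "confidence": 0.32},
        {"name": "afflicted", "confidence": 0.77},
    ]
    print(count_by_label(frame_rows))
```

Running this per frame and summing the counters gives the per-area statistics mentioned above.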

Another example is my pedestrian detection system. We can command the car to first get data from the cameras to make sure we’re dealing with pedestrians, and then get data from the distance measurement system (which can be ultrasonic or LiDAR) to decide when it should send the braking command.
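The decision itself can be sketched in a few lines. The detections are rows like the ones the pandas method returns (the COCO class for pedestrians is called person), while the confidence threshold and the safe distance are made-up numbers for illustration, not values from a real car.

```python
def should_brake(detections, distance_m, min_confidence=0.6, safe_distance_m=5.0):
    """Brake only if the camera sees a pedestrian AND the obstacle is close."""
    pedestrian_seen = any(
        row["name"] == "person" and row["confidence"] >= min_confidence
        for row in detections
    )
    return pedestrian_seen and distance_m <= safe_distance_m


if __name__ == "__main__":
    camera_rows = [{"name": "person", "confidence": 0.88}]
    print(should_brake(camera_rows, distance_m=3.2))   # close pedestrian
    print(should_brake(camera_rows, distance_m=20.0))  # pedestrian far away
```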

Conclusion

Let’s draw a conclusion from the whole article. I love YOLOv5 because it made life easier for me as a computer vision enthusiast. It provided the tools I wanted and, honestly, I am really thankful to Ultralytics for this great opportunity they have provided for us.

In general, I always prefer easy-to-use tools, and YOLOv5 was that for me. I need to focus on my goal instead of building a whole object detection algorithm or model from scratch.

I can finally conclude that a fast, easy-to-use and all-Python tool for object detection was what I was always seeking, and YOLOv5 was my answer.

I am glad to have you as a reader on my blog and I have to say thank you for the time you’ve spent on my blog reading this article. Stay safe!