Reverse engineering of 8086, from a calculator to the most used processor

If you have a laptop or desktop computer, you probably use a 8086-based CPU, or one implementation of x86 family to be exact. For example, I have a Lenovo laptop with a Core i5 CPU, which is based on x86 architecture. In this article, I want to talk about x86 architecture, and to explain how it works, I just start with the simplest one : 8086.

8086, is probably the first general-purpose processor made by Intel. This is why it’s famous, and in a lot of cases, people prefer to use it or study it. Everything is well-documented and also there are billions of tutorials and examples on how to use it! For example, if you search for “Interfacing Circuits”, you will find a lot of 8086-based computers made by people, connected to interface devices such as monitors, keyboards or mice.

Before we start reverse engineering, and make our simple x86-compatible computer, let’s take a look on the machine code structure of 8086. In this case, we just review Register addressing mode, because this mode is easier to understand or re-implement.

In this case, we can only take a look on a two-byte (or 16 bit) instruction code. Our instruction code looks like this :

Byte 1 Byte 0
|Opcode|D|W| |MOD|REG|R/M|

What are these? and why we should learn this? As we want to reverse engineer 8086 architecture and learn how it works, we need to know how this processor can understand programs! So, let’s check what are those fields in these bytes :

  • Opcode : a 6-bit number, which determines about operation (for example ADD, SUB, MOV, etc. )
  • D : Determines source or destination operand. To make reverse engineering process simple, we consider that as constant 1. So, REG field in byte 0 is always destination.
  • W : Determines data size, and like D, to make reverse engineering process simple, we consider it as a constant 1. So, we only can do operations on 16 bit numbers.
  • MOD : Determines mode, as we decided before, we only model the register addressing mode, so we need to consider mode as constant 11.
  • REG and R/M : REG shows us source, R/M shows us destination. Please pay attention, we made this special case because we are going to model register addressing mode. For other modes, we can’t consider R/M as destination.

Now, we learned how 8086 can understand programs, for now, we have some instruction code like this :

Opcode D W MOD REG R/M
xxxxxx 1 1 11 xxx xxx

Let’s assign codes to our registers. As we decided to simplify our reverse engineering process, and also we decided to use only 16 bit registers, I prefer to model four main registers, AX, BX, CX and DX. This table shows us codes :

Code Register
000 AX
001 CX
010 DX
011 BX

As you can see, now we are able to convert instructions to machine code. To make reverse engineering process even simpler, we can ignore “MOV”, but I prefer to include MOV in my list. So, let’s find opcodes for MOV, ADD and SUB.

Opcode Operation
100010 MOV
000001 ADD
010101 SUB

Now, we can convert every 8086 assembly instruction using the format we have. For example this piece of code :

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
MOV AX, CX
MOV BX, DX
ADD AX, BX
MOV AX, CX MOV BX, DX ADD AX, BX
MOV AX, CX 
MOV BX, DX
ADD AX, BX

Now, if we want to convert this piece of code to machine code, we have to use the tables we made. Now, I can tell you these codes will be :

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
1000101111000001
=> 0x8bc1
1000101111011010
=> 0x8bda
0000011111000011
=> 0x07c3
1000101111000001 => 0x8bc1 1000101111011010 => 0x8bda 0000011111000011 => 0x07c3
1000101111000001
=> 0x8bc1

1000101111011010
=> 0x8bda

0000011111000011
=> 0x07c3

Now, we can model our very simple x86-based computer. But a note to mention, this has a lot of bugs! For example, we can’t initialize registers, so we need to study and implement immediate addressing mode, also, we can’t read anything from memory (x86 is not a load/store architecture, and it lets people work with data stored in memory directly!). But I think, this can be helpful if you want to study this popular processor or do any projects with it.

How to make a computer program

This is a cliché in IT and computer related blogs. You can find at least one topic on How to make a computer program in every blog written by a computer expert (scientist, engineer or experimental expert). So, I also decided to write about it. In this topic, I’m going to explain how your idea can be a program.

I’m not a startup or business person, and I hate when someone wants to teach other people how to have an idea so I consider you already have an idea, and you want to implement your idea as a computer program. Let’s start!

Choose your target hardware

Unfortunately, a lot of programmers ignore this important step, but if you consider a special hardware to develop and implement your ideas, you will have two points :

  1. You learn a new hardware architecture (and maybe organization)
  2. You help someone who wants a specific application on that hardware.

Sometimes, you realize that writing a calculator is pretty stupid. Of course it is when you write a calculator for Windows or macOS. But, when you write a calculator for Arduino, which can interact with a keypad and displays, it’s not.

Write Algorithm

Actually, algorithm is explaining the way we solve the problems. So, we need to write the steps of our solution and test it. Sometimes, when we write a simple algorithm, it’s not efficient at all, and needs a lot of improvements. Imagine this (This algorithm makes all even numbers lesser than 100:

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
while( a < 100){
if(a%2 == 0){
puts(a);
}
a++;
}
while( a < 100){ if(a%2 == 0){ puts(a); } a++; }
while( a < 100){
 if(a%2 == 0){
  puts(a);
 }
 a++;
}

This is a piece of larger code. But wait,  how can we improve that? That if there can make this piece of code slower. But, if we consider a = 0, we can write something like this :

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
for(int a = 0; a < 100; a + 2){
puts(a);
}
for(int a = 0; a < 100; a + 2){ puts(a); }
for(int a = 0; a < 100; a + 2){
 puts(a);
}

You know, I wrote a shorter code here. Also, this short code has a better structure of making all even numbers lesser than 100.  But I think both of these codes have the same time complexity, so there is no difference. If we had two nested loops, and the inner loop’s condition had effects on the outer loop’s condition, we had to spend time on calculating time complexity and optimizing it.

Finally, you will realize there are some “classic” algorithms, which are already optimized, and you can just use them, and model your idea with them.

Choose the language

This step is also one of the most important ones. Imagine if you want to program an AVR chip, of course JavaScript is not the best choice. There are tools which allow you to write programs for those chips in JS, but the language is not made for communication with AVR! But, when you want to program a website, specially when you’re dealing with front-end stuff, C is not your best choice! But wait, if the program we want to make is a general purpose desktop program or a school/university project, we are actually free to choose the language!

Imagine we want to write a simple program, which does Addition with bitwise operations. We can write our program in C/C++ like this :

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
#include <stdio.h>
int bitwiseAdd(int x, int y){
while(y != 0) {
int carry = x & y;
x = x ^ y;
y = carry << 1;
}
return x;
}
int main(){
printf("%d\n", bitwiseAdd(10 , 5));
return 0;
}
#include <stdio.h> int bitwiseAdd(int x, int y){ while(y != 0) { int carry = x & y; x = x ^ y; y = carry << 1; } return x; } int main(){ printf("%d\n", bitwiseAdd(10 , 5)); return 0; }
#include <stdio.h>

int bitwiseAdd(int x, int y){
 while(y != 0) {
  int carry = x & y;
  x = x ^ y;
  y = carry << 1; 
 }
 return x;
}

int main(){
 printf("%d\n", bitwiseAdd(10 , 5));
 return 0;
}

And you can write it in Ruby like this :

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
def bitwiseAdd(x, y)
while y != 0
carry = x & y
x = x ^ y
y = carry << 1
end
return x
end
puts bitwiseAdd(10, 15)
def bitwiseAdd(x, y) while y != 0 carry = x & y x = x ^ y y = carry << 1 end return x end puts bitwiseAdd(10, 15)
def bitwiseAdd(x, y)
 while y != 0
  carry = x & y
  x = x ^ y
  y = carry << 1
 end
 return x
end

puts bitwiseAdd(10, 15)

But, when you want to directly communicate with hardware, you’ll need a low-level language. C/C++ are actually mid-level languages. They can help you communicate with hardware (like this piece of AVR code) :

Plain text
Copy to clipboard
Open code in new window
EnlighterJS 3 Syntax Highlighter
while(1){
PORTC.1 = 0;
delay_ms(1000);
PORTC.1 = 1;
delay_ms(1000);
}
while(1){ PORTC.1 = 0; delay_ms(1000); PORTC.1 = 1; delay_ms(1000); }
while(1){
 PORTC.1 = 0;
 delay_ms(1000);
 PORTC.1 = 1;
 delay_ms(1000);
}

or like that bitwiseAdd(x, y) function , they can help us write normal programs. But Assembly language is a really low-level language. We can use it when we need to talk to our hardware directly.

You see, all programming languages can help us, but depending on the conditions, we can use different languages.

And …?

And now, you probably know how a computer program is made. But, if you really want to become a developer, you have to study about paradigms, methodologies, etc. I tried to keep it simple in this article, but later, I’ll write about those topics more.

A very short introduction to Ambient Music

In my Persian blog, I had written a lot about operating systems, computer architecture and digital electronics. I have plans for English blog and I’ll write about computer science and engineering in future, but, I decided to explain my experiences in music for now. In this article, I’m going to talk about ambient music, and how it’s produced. Of course, this is not a music theory or musical software tutorial.

Let’s talk about ambient music, what is ambient music? Ambient music is a minimalistic, modern and electronic genre, which is invented by “Brian Eno” in early 70’s. Ambient is actually a subgenre of electronic music, but after years of evolution, it’s known as an independent musical genre.

Characteristics

Ambient music, is highly dependent on the environment. You will realize this when you hear the name. Actually, this genre is based on John Cage’s theory, Everything we do is music and his 4’33” is one of the best ambient songs ever! Four minutes and thirty three seconds of silence, the composer asks you to listen to ambient noises. It means, you can record any sounds and then make it ambient music. This is true, but, not every sounds. A lot of ambient tracks are just recordings from nature, and a simple melody is played over that sounds. Some others are electronic productions, based on natural sounds and atmospheres. This means, ambient is some kind of avant-garde music, you are free to do everything you want!

Styles

You know every musical genre, can be played in different styles. In this section, I just explain some of ambient music styles. I’m sure there are more styles, but these styles are my favorites :

  • Dark Ambient :
    This is one of the most known styles of the ambient music. Sometimes people think that dark ambient is a subgenre of ambient music, but it’s actually not. Because it’s the same concept, but with scary, depressing or dark atmosphere. Sounds like someone plays his/her music in an abandoned and haunted place 😀
  • Space Ambient :
    This is another style. If you’re a fan of outer space life, science fictions and movies like Star Wars or Star Trek, this is your kind of music. In this style, musicians use effects which can make you feel aliens are in your home! And this is what makes this style awesome!

There are more, but I usually listen to these styles. So, I can explain these two better. For more information, you can find ambient musicians on YouTube, Soundcloud, Jamendo, etc. And ask them about their style!

Subgenres

And now, we are going to take a look on subgenres of ambient music. These genres are created to show us how minimal music can be perfect!

  • Drone :
    My most favorite subgenre of ambient music, drone music is just sustained chord, note or sound. Also, artists may decorated the sustained sounds using small melodies, or a single melody is repeated on the drone sound. I’ll explain drone music in future.
  • Lowercase :
    This is the most artistic form of ambient music. Artists record sounds from nature, or daily activities, and then, amplify them and edit them to make a melody. Lowercase music is one of the most minimalistic genres, and one of the most amazing ones, too!

In this article, we talked about ambient music and which kind of music we can call ambient. In future, I’ll explain more about making an ambient track and I’ll introduce my favorite ambient artists.

Good luck!

Microcontrollers, Design and Implementation released!

It was about two years I started serious study on computer architecture. In these years, I learned a lot and I could simulate and implement a microprocessor, similar to real ones. In Summer 2016, I decided to share my experience with others. Then, I started writing this book. This book has seventeen chapters, and after reading this book, you will have a concept of computer architecture.

Chapters

  • License – Licensing and Copyrights
  • Introduction – A quick review of the book, defining target audience of the book.
  • Chapter 1 : What’s a microcontroller? – This chapter, defines a microcontroller. After reading this chapter you’ll understand the internal parts of a microcontroller. It’s completely theory, but you need the concepts.
  • Chapter 2 : How to talk to computer? – In this chapter, we have a quick view on programming and then, machine language. We determine the word size of our processor in this chapter.
  • Chapter 3 : Arithmetic Operations – This chapter focuses on arithmetic operations in base 2.
  • Chapter 4 : Logical Operations – This is all about boolean algebra, the very basic introduction to logical circuits.
  • Chapter 5 : Logical Circuits – Our journey starts here, we learn how to make logics using NAND in this chapter, and then, we learn the logic gates.
  • Chapter 6 : Combinational Circuits – This chapter is where you learn how to combine simple logics together and make more complex logics. Actually, you learn how to implement Exclusive OR and Exclusive NOR using other gates.
  • Chapter 7 : The First Computer – In this chapter, we make a simple Addition Machine.
  • Chapter 8 : Memory – In this chapter, we just take a look on sequential circuits.
  • Chapter 9 : Register File – After we learned sequential circuits, we make registers and then, we make our register file.
  • Chapter 10 – Computer Architecture – In this chapter, we’ll learn theory and basics of computer architecture and organization .
  • Chapter 11 – Design, Advanced Addition Machine – In this chapter, we add memory blocks to our addition machine.
  • Chapter 12 – The Computer (Theory) – In this chapter, we decide about what our computer should do. Actually, we design a simple ISA.
  • Chapter 13 – Arithmetic and Logical Unit – Now, it’s time to design our ALU.
  • Chapter 14 – Program Structure – In this chapter we decide about programming and machine language, and we design a simple instruction code.
  • Chapter 15 – Microcontroller – And finally, we add the RAM to our ALU, and we’ll have our simple microcontroller.
  • Chapter 16 – Programming and Operating System – In this chapter, we actually talk about the software layer of computers.
  • Chapter 17 – The Dark Side of The Moon – The final chapter, is all about making real hardware, we take a look at transistors, integrated circuits and HDL’s here.

Link to PDF File : Download

Hexadecimal Decoder (Seven Segment)

You may use a seven segment to show time, temperature, voltage, etc. But, when you want to use it as an interface, specially on a simple computer, usual decoders (e.g. 7447) are not the best choice. You may need a shift-register or you may need to make a specific decoder for your computer display.  If you watch Ben Eater’s video about EEPROM,  You will find an Electrical Erasable ROM can be your decoder. But how? Imagine we have a common-cathode seven segment display. Then, We will have something like this for number zero :

dp g f e d c b a
0 0 1 1 1 1 1 1

So,  if we want to show a hexadecimal number, what will we need? Of course, we need to find which pins of our common cathode should be on for a specific number. In case of common anode displays, pins should be off.  Now, let’s design our Truth Table.

Number dp g f e d c b a
0 0 0 1 1 1 1 1 1
1 0 0 0 0 0 1 1 0
2 0 1 0 1 1 0 1 1
3 0 1 0 0 1 1 1 1
4 0 1 1 0 0 1 1 0
5 0 1 1 0 1 1 0 1
6 0 1 1 1 1 1 0 1
7 0 0 0 0 0 1 1 1
8 0 1 1 1 1 1 1 1
9 0 1 1 0 1 1 1 1
A 0 1 1 1 0 1 1 1
b 0 1 1 1 1 1 0 0
C 0 0 1 1 1 0 0 1
d 0 1 0 1 1 1 1 1
E 0 1 1 1 1 0 0 1
F 0 1 1 1 0 0 0 1

This is our truth table. Fortunately, Logisim can make a circuit from a truth table, but I made this circuit before. It looks like this :

The inputs W, X, Y and Z represent the BCD input. But how can make this thing? if you are interested to make one of these for real, you can use a 2816 chip, which is actually an EEPROM.

Hello world!

Hello World!

This is my first blog post in English. After years of blogging in my mother tongue, Persian, I decided to start writing in English. I think blogging in English is much better, because more people can read what I write, and also more eyes will see my posts and works. I’ll start writing my experiences here, as soon as possible.