Help: Assembly Language (MASM .386)

HegemonKhan
I'm currently taking a beginning Assembly Language class, and am struggling with it.

If any of you know MASM for the 80386 (.386 or later, though we're only tested on .386) and have the time, I could use all the help I can get on it!

If you're interested/available/willing/able to help, leave a post here letting me know.

I struggled with our first program: palindrome (reversing a string), and bloody trying to figure out the IO winAPIs (ReadConsole, WriteConsole) ... grrr !!

(I'll post the code/program that I came up with in a few days to a week later, when I'm not as busy with my school work/tests)

jaynabonne
I'm willing to give it a shot! I used to program the 80x86, decades ago. :)

HegemonKhan
Awesome! Right now, the big jump for me is just learning all of the assembly commands and their syntax/format, and getting them to work, lol. The same goes for learning all the winAPIs that we'll be using more of. After doing this first actual code-work, I think I understand the basics much better: 'mov' (I'm learning the 'mov' mindset used for assembly coding, vs the 'if and assignment' mindset of high-level languages), addresses, registers, and the logic/pattern/sequence/methodology involved. I'm still a bit shaky with the stack: pushing and popping, function calls and their args/params, etc. The other jump is assembly/low-level coding vs high-level coding, as there are actual ("very useful") run-time operations vs "not as useful" compile-time operations (the high-level-like operator symbols: =, EQU, +, -, etc. don't work the same as they do in high-level languages, lol). Oh, and I really need to learn how to use the debugger... though I do like working out on my own what is wrong with my program; it feels like I'm learning assembly better by doing so than by "cheating" with the debugger in the free Visual Studio Express IDE.

The MASM we're learning still uses a lot of the 80x86 architecture (processors have kinda been stuck with Intel's original 80x86 architecture, lol), but as far as I've learned, the big change is just that (most of) the registers are expanded from 16 bits to 32 bits:

http://www.programming.msjc.edu/asm/Uni ... cture.aspx

but, I still got a lot more to learn...

----------

I won't try to bug you guys~gals too much, but if I can get any support for possibly when I need it, would be great of course. And, I know how busy, (especially) you are Jay, so you, and anyone/everyone else, don't feel any pressure. This thread is just for any possible support, emergencies, or just stuff I'm struggling with, so it's at your convenience or mood, if you want to help me with whatever I may post about, or not.

jaynabonne
When I was writing in assembly, there were no "E*" registers - only ones like AX, BX, SI, etc. And we were in real mode, so we had to address everything relative to the segment registers (and indices were 16-bit, so you had to do some interesting things to address more than 64K at a time - but then machines only had 1MB if you were lucky lol). Of course, it's all extensible, and I can read the disassembly nowadays when I'm debugging higher-level code. So we should be able to get there (or somewhere).

HegemonKhan
My next assignment (time frame/due: Mar. 2) is to write a program that emulates a 16 bit/8086 cpu

I've started and am currently confused on how to set/assign the address size vs the size of the value it holds.

--------
some of the assignment instructions:

6 registers each 8 bits (R0-R5, valued 0-5)
16-bit (1 word) address space
1k of RAM
on reset, cpu begins execution at 0
register operands are 1 byte
all memory addresses are 2 bytes long (words)
------

this is what I've brainstormed so far (not sure if this is how it can be done or not):

(I've done this for the values as can be seen below, but am confused on how to set/assign the address size to be 16 bits/words)

.data:

; (Variables):

; RA: an array/segment that holds the 6 registers (sub segments: R0-R5)

RA byte 6 dup (0)

R0 offset RA
R1 offset RA + 1
R2 offset RA + 2
R3 offset RA + 3
R4 offset RA + 4
R5 offset RA + 5

R0 sizeof type byte
R1 sizeof type byte
R2 sizeof type byte
R3 sizeof type byte
R4 sizeof type byte
R5 sizeof type byte

or, is this completely off, in what I need to do ???

-----

how do I do 16 bit (2 bytes = 1 word) address space and have the values be 1 byte ???

do I set/assign the size of the array for the address size instead of what I did above? Is the address the MSB (most significant byte: bits 8-15) and the value the LSB (least significant byte: 0-7), or is the entire 16 bits used for the address, and then I just assign the size for its values... somehow? I'm confused, grr.

jaynabonne
I'm slightly confused, as you say you want to emulate a 16-bit 8086 cpu, but an 8086 doesn't have Rx registers. So I'll assume that you're just trying to emulate *a* 16-bit cpu of some kind. If I've gotten that wrong, let me know... It's also quite strange to have an 8-bit register size for a 16-bit processor. :?

Keep in mind through all this that I'm not actually trying this out, so I could get things wrong...

(It's been a while since I dealt with MASM directives, so I found this page: http://www.oopweb.com/Assembly/Document ... H08-5.html)

Given the basic things you said above, you're going to have 6 general purpose registers (R0-R5). What you might be able to do is:

RA equ this byte
R0 byte 0
R1 byte 0
R2 byte 0
R3 byte 0
R4 byte 0
R5 byte 0


That would give RA an address of the start of your registers but still have them individually accessible. (That's assuming you need to be able to do array indexing, which you might depending on how your instructions refer to registers.)

(You could probably also go the way you were going with defining the array and then 'equ'ing offsets into it, but I'm not sure of the syntax for that.)

Your simulated memory will be a byte array of size 1K:

memory byte 1024 dup(0)


You're also going to need an instruction pointer. This is where the 16-bit address space first comes in, as your IP will be a 16-bit register which holds the next instruction address.

rip word 0


The rip will be an offset into your memory array. So it will begin reading from the start of your memory array on reset, assuming you have some "reset" code that sets rip to 0. (An alternative: dedicate a register like ESI or EDI to be your instruction pointer. It's handy to use one of those since you're going to need to use it to reference things indirectly.)
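A minimal sketch of that state, written in Python rather than MASM so the emulator's own data stays visibly separate from the host's registers. The class and attribute names are illustrative assumptions, not part of the assignment; only the sizes (6 byte registers, 1K of memory, a 16-bit IP, reset to 0) come from the posts above.

```python
# Hypothetical layout for the virtual CPU state; all names are illustrative.
MEMORY_SIZE = 1024      # 1K of simulated RAM (from the assignment)
NUM_REGISTERS = 6       # R0-R5, each 8 bits wide


class VirtualCPU:
    def __init__(self):
        # bytearray elements are limited to 0..255, like 8-bit registers.
        self.regs = bytearray(NUM_REGISTERS)
        # Simulated memory is a separate array from the registers.
        self.memory = bytearray(MEMORY_SIZE)
        # Instruction pointer: an offset into self.memory.
        self.ip = 0

    def reset(self):
        """On reset, execution begins at address 0."""
        self.ip = 0


cpu = VirtualCPU()
cpu.ip = 0x0123     # pretend the CPU has been running for a while
cpu.reset()
```

Whether reset should also clear the registers is not stated in the assignment text quoted here, so this sketch leaves them alone.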

I don't know if you also need some sort of stack pointer. If so, it will be 16 bits as well. (You can't address more than 256 bytes with an 8-bit value.)

So, on reset, you'll set the rip register to 0, and then you're going to enter a loop where you process instructions. This is the part that I'm still unclear on - you need to have your instruction set defined. Since it's a 16-bit processor, your opcodes will probably be 16-bit? Immediate values shoved into registers will be 8 bits only. (You can't put a 16-bit value into an 8-bit register.) So an instruction might be something like:

mov r0, 0

Depending on how the instruction set is defined, that might be a single byte for the "mov" and then bytes following for the register and then the value:

<some 16-bit opcode for mov + the register> <an 8-bit immediate value to assign>

That would be three bytes long.

HegemonKhan wrote: "Is the address the MSB (most significant byte: bits 8-15) and the value the LSB (least significant byte: 0-7), or is the entire 16 bits used for the address, and then I just assign the size for its values... somehow? I'm confused, grr."


To be honest, I'm not sure what you're asking here. You seem to be getting down to how to encode things, but that's where your instruction set comes in. Don't get confused by the fact that you have 16- or 32-bit registers. Just use the parts that you need. If you had your IP in esi, you could load the current value from there (I think!) by using something like:

mov al, byte ptr memory[esi]


(It's possible you might have to use ebx for that.)
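The same fetch step can be sketched in Python: read the byte at the instruction pointer, then advance the pointer. The function name and the test byte are assumptions made for illustration.

```python
memory = bytearray(1024)    # simulated 1K memory
memory[0] = 0x11            # arbitrary byte placed at address 0 for the demo
ip = 0                      # virtual instruction pointer


def fetch_byte(memory, ip):
    """Return (byte at ip, ip advanced by one), keeping ip within 16 bits."""
    value = memory[ip]
    return value, (ip + 1) & 0xFFFF


value, ip = fetch_byte(memory, ip)
```

Note that a 16-bit pointer can hold values past 1023; in this sketch such a fetch would raise an IndexError, and how the virtual CPU should react to that is something the assignment would have to define.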

I'll stop at this point and see if I've gone off in the weeds as far as your questions go. :)

HegemonKhan
bit: smallest unit on binary computer, each digit in binary is a bit

nibble: 4 bits (same as a hexadecimal digit), (2 nibbles = 1 byte)
byte: 8 bits (2 nibbles/hexadecimal digits), smallest addressable unit on most computers

word: 16 bits (2 bytes)
dword (double word): 32 bits (4 bytes)
qword (quad word): 64 bits (8 bytes)

---------------------------

For the most part (at least in what we're learning), there's really no difference between the 32-bit version we're learning and the old 8086 (80x86) that you're used to, Jay. We're mostly learning to do the same things you'd do on the 8086; the only difference is that we use 32-bit registers (except the segment registers, which stay 16-bit):

80386 (.386):

eax = (16 bit extended ax) = 32 bits (31 bit to 0 bit)

eax:

ax = lower/lsb of eax, 16 bits (15 bit to 0 bit)
N/A = we can't refer/index into the higher/msb of eax, 16 bits (bit 31 to bit 16)

ax:

ah = higher/msb of ax, 8 bits (bit 15 to bit 8)
al = lower/lsb of ax, 8 bits (bit 7 to bit 0)

VS

8086 (80x86):

ax = 16 bits (15 bit to 0 bit)

ax:

ah = higher/msb of ax, 8 bits (bit 15 to bit 8)
al = lower/lsb of ax, 8 bits (bit 7 to bit 0)
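The way AX, AH and AL overlay EAX can be illustrated with plain bit masks. This is a Python sketch of the bit layout only, not of how the hardware is built; the helper name set_al is made up for the example.

```python
eax = 0x12345678

ax = eax & 0xFFFF           # low 16 bits of EAX (bits 15..0)
ah = (ax >> 8) & 0xFF       # high byte of AX (bits 15..8)
al = ax & 0xFF              # low byte of AX (bits 7..0)


def set_al(eax, value):
    """Writing AL replaces only bits 7..0 and leaves the rest of EAX alone."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


eax_after = set_al(eax, 0xAB)
```

The upper 16 bits of EAX have no name of their own, which is why getting at them takes a shift or rotate.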

-----------------

so you can just explain to me or help me in term of the 8086, as I don't think there'll be much difference or confusion on my part, due to not really doing anything different in the 32 bit cpu system of registers of 80386. You don't need to try to learn the 80386 as it really seems to behave and/or be designed in the same way as the 8086 design/build


---------------------------------------

As best as I understand, we're supposed to mimic the 8086 CPU's registers by using variables instead of the actual registers: make variables that act like the registers. I'm not sure if we can use the actual registers (ax, bx, cx, dx, ip, sp, etc) at all or not. I presume we'd be using the stack (pushing and popping) in place of the registers to deal with the transitional step for doing 'mov', as we can't do mov mem (register variable) to mem (register variable) directly (mov R1, R0 --- ERROR, mem to mem isn't allowed).

-----------

The 'R0-R5' are just names of the variables (that are to act like the actual registers), for an example:

(I presume the 'R' was merely chosen to stand for 'register', lol)

ax -> is replaced by -> variable: R0
bx -> is replaced by -> variable: R1
cx -> is replaced by -> variable: R2
dx -> is replaced by -> variable: R3
sp/bp/di/si/ip/st~flag -> is replaced by -> variable: R4
sp/bp/di/si/ip/st~flag -> is replaced by -> variable: R5

(I haven't learned what the 'status' register is in the 8086... I presume this 'status' register is generally the same as the 'eflags' register of 32-bit CPUs/80386+)

(eflags register, bit 15 to bit 0: undefined/reserved, undefined/reserved, undefined/reserved, undefined/reserved, overflow, direction, interrupt, trap, sign, zero, undefined/reserved, auxiliary, undefined/reserved, parity, undefined/reserved, carry)
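For reference, the low 16 bits of the x86 flags register can be written as bit masks. This Python sketch uses the standard x86 bit positions (carry in bit 0, parity in bit 2, auxiliary carry in bit 4, zero in bit 6, sign in bit 7, trap in bit 8, interrupt in bit 9, direction in bit 10, overflow in bit 11); the variable names are just the usual abbreviations.

```python
# x86 FLAGS bit masks (bit positions per the Intel architecture).
CF = 1 << 0     # carry
PF = 1 << 2     # parity
AF = 1 << 4     # auxiliary carry
ZF = 1 << 6     # zero
SF = 1 << 7     # sign
TF = 1 << 8     # trap
IF = 1 << 9     # interrupt enable
DF = 1 << 10    # direction
OF = 1 << 11    # overflow

flags = 0
flags |= ZF | CF                # set the zero and carry flags
is_zero = bool(flags & ZF)      # test a single flag
flags &= ~CF                    # clear carry, leaving zero set
```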

--------------

I came up with creating the 'RA (Register Array: an array variable whose subsections are the 6 register variables)', can this be done? (see below too)

I assume that the 1k (2^10 = 1024) is the size of the entire mimic memory segment you got to work with, named as 'program buffer' (this is what we're using for the data file read into our program), and since the addresses are 16 bit, it's: 2^16 = 65536 sub-segments, but I'm confused by this...

So, if I were able to use my 'RA', would it just be a sub-segment within the 1k mimic memory segment? I presume that for the instruction operations, I'd be using the initial offset address of my 'RA' (which I presume I placed somewhere within the 1k mimic memory segment), and then indexing/scaling over to the specific register segment of it?

---------

I'm still trying to digest/understand your post's contents with trying to match up and just understand my assignment... laughs.

we're given very limited instructions for our assignment/program, just that it has to emulate a cpu, using the given information about the fake cpu (see my previous post), and a given instruction set, an example of one of them:

mnemonic ~ opcode (hexadecimal) ~ operand 1 ~ operand 2

ADD ~ 11h ~ reg1 ~ reg2

I presume the 'regX' are our register-mimic variables: R0-R5
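One possible reading of that table row, sketched in Python. Everything about the encoding here is an assumption: that the opcode byte 11h is followed by two one-byte register numbers, that the sum is stored back into reg1, and that the result wraps at 8 bits. The real instruction definitions from the assignment should take precedence.

```python
OP_ADD = 0x11   # opcode value from the table row above


def execute_add(regs, memory, ip):
    """Execute ADD reg1, reg2; ip points just past the opcode byte.

    Assumed encoding: <0x11> <reg1 number> <reg2 number>.
    Assumed semantics: reg1 = (reg1 + reg2) mod 256.
    Returns the ip of the next instruction.
    """
    reg1 = memory[ip]
    reg2 = memory[ip + 1]
    regs[reg1] = (regs[reg1] + regs[reg2]) & 0xFF
    return ip + 2


regs = bytearray(6)
regs[0] = 200
regs[1] = 100
memory = bytearray(1024)
memory[0:3] = bytes([OP_ADD, 0, 1])     # ADD R0, R1

ip = 1      # pretend the opcode at address 0 was already fetched
ip = execute_add(regs, memory, ip)
```

With these inputs 200 + 100 overflows a byte, so R0 ends up as 44; a fuller version would presumably record that overflow in a flag.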

I think from your post, I understand a little bit into how to code in the instructions using the mimic-registers (variables: R0-R5)

let me see if I can figure out how to do this stuff on my own... I think I can... if not I'll post here letting you know I need help, lol.

------

THANK YOU VERY MUCH FOR YOUR HELP, JAY!

Your post has been very helpful. I'm working on trying to understand it in relation to my (attempt at understanding the) assignment, lol.

I just need (at least for now) some understanding/guidance on where/how to begin, as I'm a bit lost just being thrown this assignment so quickly while still new to assembly, with so little instructional help.

Don't try to help me in too much detail or beyond what I ask, if you can, as I need to do this myself, my questions are just to hopefully steer me in the right direction or approach in trying to do this assignment. I just want that little nudge/push/hint, so I can hopefully figure out how to do it on my own (besides the needed little push/nudge/hint). This is why my posts might be a bit vague and/or intentionally leaving out some content. I just want nudges, not help with doing the entire assignment. It's my assignment to finish, but I do need a little hintful helping as I'm a bit lost at how to get started and go about the various aspects of it.

XanMag
Just google it. All the kids are doing it nowadays... :|

HegemonKhan
I really suck at searching the net, sighs. I can't find anything on this at all from googling. Also, there's so much assembly online... it's hard to know what is authentic/official and/or matching to the assembly you're using.

jaynabonne
It must have been late last night when I wrote that, but when I got up this morning, it seemed clearer to me. And what you said in your post makes me think that you're slightly confused the way I was as well. Maybe we can gain clarity together. :)

[Also, I understand completely what you mean about not going beyond what you ask (and I admire that). I'll try to restrain myself... ]

You had said you need to emulate an 8086 CPU, but what it looks like you're actually doing for your assignment is emulating an arbitrary 8-bit processor *using* an 8086 emulator to write your code. Does that make sense? In other words, you're writing the code in 8086 assembler (and probably running it on an 8086 emulator), but what you're trying to emulate *through your code* is this imaginary 8-bit processor. So the register set and all that will have nothing to do with the 8086.

It's as if you're an engineer and your company wants to make a new 8-bit processor. But you want to be sure it will work reasonably, so your boss says "Go write some software that emulates this." It could be written in C. It could be written in Java or Javascript or Pascal or anything. For this assignment, you happen to have to write it in 8086 assembly. It might help, actually, if we discuss things as if it were in a language besides 8086 assembly. That might help to make clear where the dividing line is.

Imagine you were going to write this in Quest. :) You would have an object called "Processor", with attributes for each of the registers. Then you'd have some sort of array or list that corresponds to the memory the processor acts on. Quest would make it difficult, since you can't modify arrays very easily. But maybe that image helps?

To some of your points above.

Even though you're going to be storing R0-R5 (and all your other registers perhaps) in your *emulator's* memory (they have to live somewhere), in the virtual space of your 8-bit processor, they won't be in *its* memory, just as the 8086's AX, BX, etc registers aren't in actual memory - they're in the processor. You are basically building *in code* what would normally be done in hardware within the processor chip. There's storage internal to a processor for registers and things. Since you're doing it all yourself, you need to allocate a place for those in your program, but they won't be in the 1K memory space for your virtual processor. They will live outside that, in the "processor object" if you will. (That is, they won't be addressable.)

So you will have the *registers*, which are your R0-R5, the IP, SP, status flags, etc - all "internal" to your simulated processor. And then you'll have the 1K "memory" the processor operates on. Memory and registers are completely separate things. You said you can't do memory-to-memory moves, which seems reasonable. (You couldn't do that, for example, on a 6502. You always had to load from memory into a register and then store from the register to another memory address.) But that's *not*

MOV R0, R1

That's moving between two of your virtual registers! In your virtual processor, that's an internal operation, not involving your virtual processor's "memory" at all.
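A small Python sketch of that distinction: the virtual registers and the virtual memory are two separate arrays in the emulator, so a register-to-register move never touches the memory array. The function name is an assumption for illustration.

```python
regs = bytearray(6)         # virtual registers R0-R5 (emulator variables)
memory = bytearray(1024)    # the virtual processor's 1K memory

regs[1] = 0x2A


def mov_reg_reg(regs, dst, src):
    """MOV Rdst, Rsrc: copy one virtual register into another."""
    regs[dst] = regs[src]


mov_reg_reg(regs, 0, 1)     # MOV R0, R1
```

In the host's own assembly this copy still has to pass through a real register, but that is a detail of the emulator, not of the virtual processor.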

I know it will be a bit confusing at first, but you need to keep straight what's virtual from what is not, keep straight what things look like from the virtual processor's point of view vs what it looks like in the code that's *emulating* it all. The registers R0-R5 are just variables in your emulator, and as such, they will be in your program's data segment, but from the point of view of the instructions running in your virtual processor, they will be in the virtual processor as registers.

The thing to keep in mind is that you're going to be implementing all the instructions for this processor using 8086 instructions. And the instructions will be opcodes that you'll read as numbers. You won't see "MOV". You'll see an 8-bit number like 0x63. You'll need to read that byte at the IP, go "this is a MOV instruction" and then read whatever additional bytes you need to make up the operands, by incrementing your IP and reading subsequent bytes from your memory array. (It's all clear to me because I used to read and write the actual bytes by hand, and I even wrote a 6502 emulator on the 6502-based Apple computer I had, just for fun. Yes, heady days. lol The thing is, you need to allocate a place for each virtual register in your emulator's memory, just as you're doing.)

Think of the instructions as being arbitrary "tokens".

Hopefully that makes things clearer. If not, then let's discuss some more, at this level. Because I feel like there's a conceptual disconnect that's going to make it hard for you to do what you need to do until the light bulb goes on. :) Perhaps a picture would help...

HegemonKhan
We're using the .586 MASM build (newer than the .386), with Visual Studio Express as the IDE. I don't think we're using or making any actual CPU emulation program (the ones I've seen while researching online, in various languages, seem a bit too advanced for us). I think we're only using variables to emulate/mimic/substitute for the actual 16-bit registers. (We're working with the 32-bit registers, but that doesn't really matter much, as we can refer to their 16-bit and 8-bit divisions.) Those variables are in the DS, which makes them mem, so mov R0, R1 would still be a mem-to-mem move, I think, unless I'm completely misunderstanding the emulation/virtualization concept in your post. We are, though, going to run a 'machine.bin' file (which may be in binary) that was given to us, along with a starting program to work with/add to (with code showing how to do the winAPI stuff for reading a file, as we've only learned how to do IO to the console so far), to see if our emulated CPU program works. Maybe I'm completely wrong, though, and you're understanding the assignment better than I am, which is very likely.

I'll email or talk to my professor (Wednesday, tomorrow) and hopefully get my confusion straightened out about what exactly we're supposed to be doing for this assignment. Not the "how-how" of doing the coding, just what exactly we're supposed to be trying to do. I'm too dumb/stupid to figure this out myself, hence my confusion, and asking here in hopes of getting some guidance/hints/understanding of what I need to do, sighs. I'm not even sure how to create the instruction set, as I don't know what/how the instructions will be used... argh. So many questions/confusions, sighs. I really want to learn assembly, and I'm slowly getting it, but this stuff is tough (at least in trying to figure out what we're meant to do with this intentionally vague assignment). I'm struggling with it, whereas, thanks to Quest, I've hardly struggled with the higher-level language classes (C++ and Java).

Unfortunately, that post confused me a bit more, let me see if I can get some clarity/guidance from my prof directly, and then we can go from there, if I still need help, laughs. I will be trying to understand/digest your post too, but I'm going to get my other school work done and out of the way today, so I got the rest of the ~week to try to get this assembly lab figured out and done and working, laughs.

HegemonKhan
okay... in trying to make some sense of all of this...

I'm guessing that the '1024' byte array variable 'program_buffer byte 1024 dup (?/0)', since it *IS* being used to hold the data read from the 'machine.bin' file that we're using to test our program, would be used for the 'resetting to location 0', and not my own written MASM program's DS and/or CS. Does this sound right, or completely wrong?

Then, using the address location of my '6' byte array variable 'RA byte 6 dup (?/0)', I can index/scale/direct-offset (via as adjusting "ip") to the specific sub-segment (R0-R5: virtual/emulation/substitute registers) of it. And, if I understand correctly, I can make the address size of it be 16 bits by another means (such as maybe the 'word ptr' or something to that effect, which I would need to be 16 bits in order to do the indexing/scaling/etc to the subsegment R0-R5 "registers", anyways, right?)

Are the given instruction set's opcodes, then, referring to the memory addresses in the 1024 byte array variable (via adjusting "ip")? Or am I to create the instruction sets in my own program, referring to them via the DS (if they're to be variables) and/or the CS (if they're to be Labels/Procedures/Macros)?

Is this the correct setup/design I created (I know I'm asking about what I've done already, as it seems to make sense to me...), or do I have it completely wrong?

jaynabonne
First, if your instructor is giving you a machine.bin file, then part of the assignment must be the definitions for the various opcodes/instructions you need to implement. Is that the case? Because the structure of the instruction set can help guide how you implement the various (virtual) registers. I think if you could look at those instructions, you'd see what they look like, what kind of operations can be performed, etc. Without being given how the actual instructions look byte-wise, you'd have no chance of running the .bin file.

HegemonKhan wrote: "I'm guessing that the '1024' byte array variable 'program_buffer byte 1024 dup (?/0)', since it *IS* being used to hold the read data from the 'machine.bin' file that we're using to test our program, would be used for the 'resetting to location 0', and not my own written MASM program's DS and/or CS, does this sound right, or completely wrong?"



Yes and no. The 1024 byte array is your simulated memory. Your virtual instruction pointer will be an index into that memory. (Your simulated processor's memory space will range from 0 to 1023). You're going to have a virtual IP (which will probably just be a variable you define). That's the thing: you can't emulate or simulate a processor using your own processor's resources (instruction pointer, stack pointer, status flags, etc). You'll need all those resources to write the code that does the actual simulation. So you're going to need something else to hold those values. The simplest approach is to have memory variables that represent those virtual internal registers. (Again, think of how you'd do this in Quest. The values have to go somewhere.) So to reset your virtual processor, it would be as simple as

mov virtual_ip, 0

where virtual_ip is your variable that holds the current instruction pointer. But the missing part is that you will then *use* that virtual instruction pointer to index into your 1K virtual memory and read values (the .bin file, in this case) a byte at a time and execute them based on what the opcode is that you've fetched.
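As a rough Python sketch of that sequence: copy the program image into the simulated memory, reset the virtual instruction pointer, and fetch the first opcode. The three image bytes are a made-up stand-in for the contents of machine.bin, and load_program is a hypothetical helper name.

```python
def load_program(memory, image):
    """Copy a program image to the start of simulated memory."""
    if len(image) > len(memory):
        raise ValueError("program image is larger than simulated memory")
    memory[:len(image)] = image


memory = bytearray(1024)                # the 1K simulated memory
image = bytes([0x11, 0x00, 0x01])       # stand-in for the bytes of machine.bin
load_program(memory, image)

virtual_ip = 0                          # reset: execution starts at address 0
opcode = memory[virtual_ip]             # fetch the first opcode byte
virtual_ip += 1                         # advance past it
```

In a real run the image would come from reading the file, for example with open("machine.bin", "rb").read() in Python.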

HegemonKhan wrote: "Then, using the address location of my '6' byte array variable 'RA byte 6 dup (?/0)', I can index/scale/direct-offset (via as adjusting "ip") to the specific sub-segment (R0-R5: virtual/emulation/substitute registers) of it. And, if I understand correctly, I can make the address size of it be 16 bits by another means (such as maybe the 'word ptr' or something to that effect, which I would need to be 16 bits in order to do the indexing/scaling/etc to the subsegment R0-R5 "registers", anyways, right?)"



The 16-bit accesses into your simulated memory block actually have nothing to do with the R0-R5 registers. Those registers are 8-bit *data* registers. To fetch instructions, you'll use your instruction pointer. The instruction pointer needs to be 16 bits, since it's effectively a 16-bit address (even though, really, for you it's just an offset into your 1K block). To read data, that will come from the operand part of the instruction.

Let me give an example, using the simplest processor I know well, the 6502. The 6502 has three 8-bit data registers, A, X, and Y. It also has an 8-bit stack pointer, an 8-bit set of processor flags, and a 16-bit instruction pointer. The instruction to load a value into the A register (accumulator) is - not surprisingly - "LDA". But if you look at the docs, you'll see there are 8 variants of LDA depending on what your addressing mode is.

To load an immediate 8-bit value, it's simple. The opcode is $A9 (the $ means hex in this case), and the value to load into A immediately follows it:

LDA #$64 -> $A9 $64

So when the processor is reading instructions, if it sees $A9, it goes "That's loading the accumulator with an immediate value". So it reads the next byte value and then puts that in the A register. If you were emulating that, you'd do the same thing: you'd read the $A9 by indexing into memory; then you'd increment your IP. Then you'd see it was a LDA immediate, read the next byte and then increment IP again. Then you'd store the value in the A register.

Another addressing mode is "absolute". In that case, the load is coming from a memory address, which is specified in the instruction. Having a 16-bit address size, that takes two bytes, which are read in succession and then assembled into a single value to put on the address bus. If you, say, wanted to load the value at memory location $4010 (and given that the opcode for LDA absolute is $AD), you'd have:

LDA $4010 -> $AD $10 $40 (low order part of the address is first on a 6502. On a 68000, it's reversed. You need to know your endian-ness.)

So when you encounter a $AD, you know you need to read *two* following bytes to get an address which you then read from.
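The byte-combining step can be sketched in Python. On a little-endian encoding like the 6502's, the first byte after the opcode is the low half of the address and the second is the high half. The helper name read_address is an assumption.

```python
def read_address(memory, ip):
    """Read a little-endian 16-bit address at ip; return (address, new ip)."""
    lo = memory[ip]
    hi = memory[ip + 1]
    return (hi << 8) | lo, ip + 2


memory = bytearray(1024)
memory[0:3] = bytes([0xAD, 0x10, 0x40])     # LDA $4010

# ip = 1 points at the first address byte, just past the opcode.
address, ip = read_address(memory, 1)
```

For a big-endian encoding the roles of the two bytes would simply swap.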

That gives you two places you will use 16-bit addresses in an 8-bit world: to fetch instructions and operand data via the IP, and to read absolute (or indexed or whatever) data specified in instructions. But I doubt you will ever use the 8-bit registers as an address to fetch data, unless there is some special mode where you can combine them to generate an address.

HegemonKhan wrote: "Are then the given instruction set's opcodes, referring to the memory addresses in the 1024 byte array variable (via adjusting "ip") ?? Or, am I to create the instruction sets in my own program, referring to them via the DS (if they're to be variables) and/or the CS as (if they're to be) Labels/Procedures/Macros?"


I may have answered some of that above. The opcodes are data values living *in* the memory. You will fetch them one at a time, see what they are, and execute them. (A crude implementation of this would be a large if/then or switch statement, but you would most likely use a jump table of addresses, with a function per supported opcode.) They may refer to data in-stream as well (as shown above). But you will have to write code to actually implement each instruction. For example, to implement the LDA immediate above, you might have (pseudocode) like:

get next byte pointed to by IP
increment IP
store that value in the A register

For the absolute case, it would be something like:

get next byte pointed to by IP and hold it somewhere
increment IP
get next byte pointed to by IP
increment IP
combine the two bytes into an address
get the byte pointed to by that address
store that value in the A register.
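The two pseudocode sequences above translate fairly directly. Here is a Python sketch covering only these two 6502 opcodes ($A9 and $AD); the class name is made up, and the real LDA's side effect on the N and Z flags is left out to keep the example short.

```python
LDA_IMMEDIATE = 0xA9
LDA_ABSOLUTE = 0xAD


class CPU6502Sketch:
    def __init__(self):
        self.a = 0                          # accumulator
        self.ip = 0                         # 16-bit instruction pointer
        self.memory = bytearray(0x10000)    # full 64K address space

    def fetch(self):
        """Get the byte pointed to by IP, then increment IP."""
        value = self.memory[self.ip]
        self.ip = (self.ip + 1) & 0xFFFF
        return value

    def step(self):
        """Fetch and execute one instruction."""
        opcode = self.fetch()
        if opcode == LDA_IMMEDIATE:
            self.a = self.fetch()
        elif opcode == LDA_ABSOLUTE:
            lo = self.fetch()               # low byte of the address first
            hi = self.fetch()
            self.a = self.memory[(hi << 8) | lo]
        else:
            raise ValueError(f"unsupported opcode {opcode:#04x}")


cpu = CPU6502Sketch()
cpu.memory[0:5] = bytes([0xA9, 0x64, 0xAD, 0x10, 0x40])   # LDA #$64 ; LDA $4010
cpu.memory[0x4010] = 0x99

cpu.step()
a_after_immediate = cpu.a
cpu.step()
a_after_absolute = cpu.a
```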

You're going to have code like that (but real code) for each and every instruction. You're implementing the logic for this virtual processor, including all the functionality for all the opcodes. (In actual processors, this is termed "microcode". Since you're not inside a processor, it's just "code acting like microcode".)
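The jump-table idea mentioned earlier (one function per opcode, looked up by opcode value) might look like this in Python. Both opcodes here are invented purely for the demonstration; they are not from the assignment's instruction set.

```python
# Invented opcodes, for illustration only.
OP_INC_R0 = 0x01
OP_HALT = 0xFF


def op_inc_r0(state):
    state["regs"][0] = (state["regs"][0] + 1) & 0xFF


def op_halt(state):
    state["running"] = False


# The "jump table": opcode value -> handler function.
HANDLERS = {OP_INC_R0: op_inc_r0, OP_HALT: op_halt}


def run(state):
    """Fetch, look up and execute instructions until a handler stops the loop."""
    while state["running"]:
        opcode = state["memory"][state["ip"]]
        state["ip"] = (state["ip"] + 1) & 0xFFFF
        handler = HANDLERS.get(opcode)
        if handler is None:
            raise ValueError(f"unknown opcode {opcode:#04x}")
        handler(state)


state = {"regs": bytearray(6), "memory": bytearray(1024), "ip": 0, "running": True}
state["memory"][0:3] = bytes([OP_INC_R0, OP_INC_R0, OP_HALT])
run(state)
```

In assembly the equivalent would be a table of code addresses indexed by the opcode, followed by an indirect jump or call.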

To sum up, what you have so far is a rough beginning. You'll need:

- R0-R5 as byte values (memory locations to hold the simulated registers)
- IP as a word/16-bit value which points into your memory to fetch the next instruction.
- SP (stack pointer): I assume, since almost all processors do, but it depends on what your instructions are.
- Flags: these hold the status result for the most recent operation. (For example, decrementing a register will set the Z flag if the value goes to zero.)

All this stuff that's been done for you when you use your own assembly instructions (updating registers, moving data around, etc), you now need to implement yourself for this virtual processor.
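As one example of doing that bookkeeping by hand, here is a Python sketch of a decrement that also maintains a zero flag. The flag's bit position and the function name are assumptions; the virtual processor's real flag layout would come from its instruction set definition.

```python
Z_FLAG = 0x01   # hypothetical bit position for the virtual zero flag


def decrement(regs, index, flags):
    """Decrement an 8-bit virtual register and return the updated flags."""
    regs[index] = (regs[index] - 1) & 0xFF      # wrap around at 8 bits
    if regs[index] == 0:
        flags |= Z_FLAG                         # result is zero: set Z
    else:
        flags &= ~Z_FLAG                        # otherwise clear Z
    return flags


regs = bytearray(6)
regs[2] = 1

flags = decrement(regs, 2, 0)           # 1 -> 0, so Z is set
value_after_first = regs[2]
flags_after_wrap = decrement(regs, 2, flags)    # 0 -> 255, so Z is cleared
```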

I don't know if that's making any sense or helping. I'm actually surprised they're having you do this, if it's a beginning assembly class. Though it will give you good insight into what your own processor is doing by having you effectively write your own. :)

HegemonKhan
ah, that post makes much more sense to me (I understand it pretty well), just have to study some parts of it specifically.

I still have one remaining question (for now), which I couldn't quite discern from your post:

are the virtual registers (R0-R5) supposed to be located within the '1024 array variable memory segment' or outside of it ?

also, about the opcodes... (okay that's actually one more question, lol)... (I have to get to class now, so I haven't been able to read your post more closely/deductively, so that's why I'm asking this question, even though I could maybe have discerned it on my own... when I get back from class to read your post more closely)...

so they're the values stored in an address in the '1024 array variable memory segment', or are the opcodes the addresses and we get their values, to 'if check' what operation/instruction-set we do? or, are the opcodes stored outside of the '1024 array variable memory segment' ???

jaynabonne
The virtual registers are outside the memory segment, just as in the x86 processor, where you have the AX, BX, CX, etc registers which are on the processor chip but aren't part of the main memory (that is, they don't have a physical address in main memory).

And the opcodes (operational codes, or instructions) are just numbers stored in memory. They have meaning to the processor when being interpreted as instructions (e.g. 1 = load memory, 2 = store memory, 3 = move register to register, 4 = add a value to a register, 5 = increment a register, etc - all things I just made up, by the way), but at the end of the day, they're just numbers. That's one of the magical things about computers - code is just data. :)

Imagine I tell you the following: I'm going to give you a string. As you go along the string, the following rules apply.

- When you hit an 'A', that means load the value of the number that follows. So "A0" means "load the value 0".
- When you hit a 'B', that means add the value of the number that follows to what you have so far. So "B5" means add 5.
- When you hit a 'C', that means subtract the value of the number that follows from what you have so far. So "C4" means subtract 4.
- When you hit a 'D', that means to increment the value you have.
- When you hit an 'E', that means to decrement the value you have.
- When you hit an 'F', that means to print the value you have.

What would be printed by the following "program"?

"A3B8DDC5EF"

To figure it out, you'd step through the string, character by character. First, you load the character at index 0, and you get an 'A'. So you execute the code for 'A', which is to load the value of the next number. You look at index 1, and you see 3, so you load 3 into your accumulator. Then you increment your offset (instruction pointer or program counter) to 2. (Your accumulator is now 3.)

You then load the value at offset 2, and you find a "B". So you execute the code for "B" - grab the number at the offset past the "B", add it to your accumulator, and then increment past that number. (Your accumulator is now 11.)

The next "opcode" at 4 is "D", which is to increment your accumulator. You do that (getting 12). It has no arguments, so your offset of 5 is the next instruction.

You keep going. You increment the accumulator (D at offset 5), subtract 5 from it (C5 at offsets 6 and 7), decrement it (E at offset 8) and finally print it (F at offset 9) -> resulting in 7 (if I did my math right).

That's how a processor works. The "offset" pointing into the string is your 16-bit instruction pointer pointing into memory. The values you read (the A, B, C, etc) are your opcodes. In your virtual processor, they'll just be 8-bit values, ranging from 0-255. That gives you 256 possible opcodes, which you need to know the definitions for and write code for.
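If it helps to see the string "program" above actually run, here is a minimal Python sketch of it (Python rather than MASM, just so it can be tried anywhere). The function name run_toy and the list used to collect the "printed" values are my own choices for illustration, not part of the assignment.

```python
def run_toy(program):
    """Step through the string one opcode (and optional digit) at a time."""
    acc = 0        # the accumulator ("what you have so far")
    ip = 0         # offset into the string, i.e. the instruction pointer
    printed = []   # values "printed" by F, collected so they can be checked
    while ip < len(program):
        op = program[ip]
        ip += 1                      # step past the opcode
        if op in "ABC":
            arg = int(program[ip])   # a single digit follows A, B and C
            ip += 1                  # step past the operand
            if op == "A":
                acc = arg            # load
            elif op == "B":
                acc += arg           # add
            else:
                acc -= arg           # subtract
        elif op == "D":
            acc += 1                 # increment
        elif op == "E":
            acc -= 1                 # decrement
        elif op == "F":
            printed.append(acc)      # "print"
        else:
            raise ValueError("unknown opcode %r at offset %d" % (op, ip - 1))
    return printed


result = run_toy("A3B8DDC5EF")
print(result)  # [7]
```

The only state is the accumulator and the offset, which is exactly the role the registers and the instruction pointer play in the virtual processor.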

Hope that helps, and good luck! (And let me know if you have more questions.)

HegemonKhan
Thank you for that explanation (sorry about the question about where the registers are stored; after I posted I was like, that was a dumb question, as the registers are separate from memory. The CPU has its own limited memory array/space/segment, which is, *LOGICALLY* - somehow that I think is still a bit of a mystery to even computer theory/theorists on how it is able to work, lol - as I've come across in my research, divided up into sub-segments, some of which are better known as the registers, 32:eax/16:ax/8:ah,al, ebx/bx/bh,bl, ecx/cx/ch,cl, edx/dx/dh,dl, etc. These are the fastest memory sources, as they're a part of the CPU, whereas RAM is separate, bussed off on the mother board, as the "mem sticks/bus", so it's slower than the register memory, but it's less volatile than the registers, which require variables or the stack-pushing-popping to preserve their data. RAM is still volatile too obviously, especially compared to secondary memory, storage disk space, such as a HDD or ODD, extHDD, flash, etc. And even these are still volatile due to entropy and wear of the materials and etc physics/chemistry stuff, so after say maybe 10 years, contrary to the layman's belief that the data lasts forever, your data on your burned cd is gone/corrupted/damaged permanently ---except maybe not in zero, well near zero, Kelvin, lol), but I wasn't sure about how the instruction set/opcodes worked and where they're located, so that was really helpful.

As I already mentioned/explained above, I finally am understanding well about how memory itself works (already: prior to your posts, thanks to some of my C++ and Java classes and a lot to my current Assembly class), so I do/did already understand about base addresses, offsets, indirect, and etc. I just didn't know how the opcodes/instruction set worked at all.

--------

Your posts have been of such great great tremendous help, THANK YOU VERY MUCH, Jay! I think I can hopefully figure out how to do the lab now on my own (and maybe a little more help from the prof), but I was asking a classmate today, and he said that I was on the right track, thanks to what you explained and helped me with, with these posts of yours. So, hopefully, I won't be needing to ask you any more, and can figure the rest out on my own, HK crosses his fingers, but maybe I'll still be stumped on some things... HK is hoping he can do it now!

Probably a little too helpful (and not that I'd: "not looking a gift horse in the mouth", lol, but I do like trying to do as much as I can on my own and hate asking for help, but sometimes I do need help, no matter how hard I try on my own)... though I do have a much better understanding now, which I didn't priorly. Also, I already knew about some of the "too helpful" content, whereas I was more stuck on just the exactly what the assignment wanted me to do in general terms / design, as I couldn't induce it on my own, due to being new to Assembly. It's like expecting someone to know (or to figure out that) something when they don't know that something (or nor the parts needed to figure out that something. Imagine Sherlock Holmes trying to solve a case without knowing what pieces to even be looking for, lol). If I already knew assembly well, then I'm sure, I'd know exactly what was expected from the vague instructions, but I was quite lost until I got help from you Jay, as I don't already know Assembly, and thus couldn't figure out what I didn't know already. It's a bit of a cunundrum (can't spell), ya you obviously don't want to tell them how to do the program, but they need to have an idea how to do the program, to do the program. It's a near impossible balancing act.. Provide enough instruction to steer/guide them in the right direction or understanding, without giving instruction on "how-how to do the actual lab directly". How to give a hint without it being (or ending up as) an answer, is probably a better analogy that I just thought of.

------

Anyways, I'll try to get the same level of help from the teacher as I got from you and my classmate, so I appreciate all the help: you, classmate, and hopefully the teacher too, crediting all of your help of course. I needed it, as I was really lost/confused on my own.

jaynabonne
Glad I could help, and good luck! :)

(And I hope I didn't cross too many lines. Sometimes the hardest part in answering a question is working out what's being asked.)

HegemonKhan
laughs, that too... on your side, sorry for the confusion on what I was asking about or for, it's not easy when you don't know yourself what to ask about or how to word your question correctly, sighs. So, rest easy, as indeed I'm to blame, for any possible "too helpfulness", due to my unclear posts and questions

HegemonKhan
So, I talked to the prof on some things that I was still confused about, wanting to make sure that what I gleaned from the help from you Jay and others is what was wanted from the lab/project/program/assignment...

WOW... just WOW... all this lab is... is a Script Dictionary !!! HK bangs his head on the desk, just a bloody... script dictionary... design! HK laughs madly, lol.

I was confused on the instruction set/opcodes, I thought I had to construct them like functions or "like add the opcode address and the reg1 address and the reg2 address together, or somehow put those 3 address values together to get the address of the instruction set/operation or whatever lol, and then magically know what kind of function/action to use it for... which was really perplexing me"

Only difference of course is that this is assembly, using addressing, as the 'opcode values' are just the 'keys' of a Script Dictionary, and the scripts are just achieved also via (a simple method of many methods) assembly's old school usage of "goto" jumping. Instead of 'if', as this is assembly (though I think assembly does have 'if' instruction sets added to it now too) we just use 'cmp (compare)', for pseudocode example:

file: 1 2 3 4 5 6 99 87 55 30 70 20 10 0 78 -> put (read) into -> mem_array

opcode1 = 5
opcode2 = 30

:start:
cmp mem_array_offset[index] == opcode1, if true, goto 'add' code line/block, and do the given operations of 'add'.
cmp mem_array_offset[index] == opcode2, if true, goto 'mov' codeline/block, and do the given operations of 'mov'
etc etc etc
increase index
goto start

:add:
add reg1, reg2

:mov:
mov reg1, reg2

etc etc etc
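As a rough sketch of the same "script dictionary" idea in a high-level language (Python here), the opcode values are the keys and the handlers are the scripts. The opcode numbers 5 and 30 are the made-up ones from the pseudocode; the starting register values and the names do_add, do_mov, handlers and trace are hypothetical, chosen only for illustration. Like the pseudocode, it ignores operands and treats every byte as a possible opcode, which is a simplification.

```python
# Made-up opcode values, as in the pseudocode
OPCODE_ADD = 5
OPCODE_MOV = 30

regs = {"reg1": 10, "reg2": 3}   # hypothetical starting values
trace = []                       # which handlers ran, in order


def do_add():
    regs["reg1"] = regs["reg1"] + regs["reg2"]
    trace.append("add")


def do_mov():
    regs["reg1"] = regs["reg2"]
    trace.append("mov")


# The "script dictionary": the opcode is the key, the handler is the script
handlers = {OPCODE_ADD: do_add, OPCODE_MOV: do_mov}

mem_array = [1, 2, 3, 4, 5, 6, 99, 87, 55, 30, 70, 20, 10, 0, 78]

index = 0
while index < len(mem_array):
    handler = handlers.get(mem_array[index])   # the "cmp ... je" step
    if handler is not None:
        handler()
    index += 1                                 # "increase index, goto start"

print(trace, regs["reg1"])  # ['add', 'mov'] 3
```

In assembly the dictionary lookup becomes the chain of cmp/je pairs, and each handler becomes a label that jumps back to the start when it is done.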

-----------------------

this lab is so easy... now... (aside from not screwing up my addressing or whatever mistake, and having to trouble shoot it/them, lol)

--------------------

AGAIN, THANK YOU JAY, your posts really helped a lot in getting the initial basic understanding of how this is done in assembly, I would never have figured any of this out on my own... sighs. I'm stupid/dumb...

but I'm also...

HK, THE SMARTEST DUMBEST PERSON! :D

(I really like this title I came up with myself for myself, hehe, from struggling so much with understanding programming, but then having moments of, for me genius, in suddenly understanding the programming when just a few moments ago I was totally clueless, this has happened so much to me as I'm trying to learn more and more programming, doing more and more coding labs/projects and etc)

(I'm really happy that I now finally understand this assembly lab, maybe I do have a chance in this class, ... if I can start doing better on tests... grr. HK still though dances around in joy... knocking over everything... as he can't dance on top of being super clumsy... lol)

jaynabonne
The light bulb has gone off. :idea: I'm glad it makes sense now. I tried to hint at that a bit (or more) above, and I even had "script dictionary" in my head at one point as a Quest analogue, but it never made it into my response. I think you're going to enjoy this one, once you get into it. It's magical to watch your code executing other code. :)

HegemonKhan
(bloody need to install VS community for the binary editor, so I can see the machine.bin, for testing/debugging/checking my program. Last time I tried to install VS community its trial subscription ran out, but isn't there supposed to be a free version of it? As the free 'express' version I've been using doesn't have the binary editor, which I need, grr. I better not have any issues... I need to work on my program... as the technical details of it are still a bit murky for me... despite understanding that I'm doing a script dictionary now)

-------------------------

quick question:

so I got my 'machine.bin' file with a bunch of hexadecimal values in it (some of those represent the opcodes), which we place into an array variable, and iterate through it seeing if those values are the opcode values and performing the opcode instruction/operations.

some of the given instructions say to use "address", are they referring to the array containing the machine.bin data, meaning that we're to change those values based on our instructions, which will be re-read over and over again, with those new values (instead of the original values), possibly being an opcode, and this continues until we hit the terminate program opcode?

for example, say the mem array (which initially contains the original hexadecimal values/opcodes):

00 01 02 03 04 05 06 07 08 09 00 01 02 03 04 05 06 07 08 09

and lets say the given instruction sets are:

00: add
01: sub
02: mov
03: store address ; store value in reg1 (R0) into "address"
04: load

let's pretend that 'store' changes the (addresses/offsets that have the) '09' values to say '02', as seen below:

00 01 02 03 04 05 06 07 08 02 00 01 02 03 04 05 06 07 08 02

and thus when it iterates through again, it now does 'mov' operations when it gets to those addresses (and thus in this example, there's no 'store' operations now occurring), is this what is supposed to happen?

-------------

or, am I getting confused, and I just iterate through the mem array (which holds the machine.bin file; the hexadecimal/opcodes) only once, (the iteration is through the loop/goto checking of the opcode values to jump to what operation to perform), and the manipulation is based on whatever the given instructions are, whether they manipulate the values in the mem array or not?

----------

also... from this given assignment statement:

CPU contains 6 registers each 8 bits (R0 – R5 valued 0 – 5)

does this mean that the registers are initially set to having those values?:

mov R0, 00h
mov R1, 01h
mov R2, 02h
mov R3, 03h
mov R4, 04h
mov R5, 05h

as I'm wondering this, due to having 'add and sub' dealing with reg operands

HegemonKhan
I'm still struggling a bit with how to use the given instruction sets with the 'machine.bin' (opcodes) file, sighs. If I could just get some help with walking through one of them, would be immensely helpful. There's so much information overload for me, that I'm overwhelmed, confused, and lost, sighs.

our given CPU information:

cpu contains 6 registers, each 8 bits (R0-R5 valued 0-5)
16-bit address space
1k of RAM
on reset the cpu begins execution at location 0
cisc style instruction set, instruction length varies with instruction
register operands are 1 byte
all memory addresses are 2 bytes long and are stored in big endian format

our given instruction set:

mnemonic  opcode (hex)  operand 1  operand 2  notes
ADD       11h           reg1       reg2       reg1 = reg1 + reg2
SUB       22h           reg1       reg2       reg1 = reg1 - reg2
XOR       44h           reg1       reg2       reg1 = reg1 XOR reg2
LOAD      05h           reg1       address    load reg1 with value at address
LOADR     55h           reg1       address    load reg1 with value at (address+reg1)
STORE     06h           address    ---        write value in R0 to address
STORR     66h           reg1       address    write value in R0 to (address+reg1)
OUT       0CCh          reg1       ---        send value in reg1 to output (screen)
JNZ       0AAh          reg1       address    if value in reg1 isn't zero then next instruction is at address
HALT      0FFh          ---        ---        CPU halts - terminate program

our given 'machine.bin' file (only the first 48 bytes, offsets 00h-2Fh):

offset      0  1  2  3  4  5  6  7    8  9  A  B  C  D  E  F    ASCII
0000 0000   05 04 01 A3 44 02 02 05   00 01 A0 05 04 01 A3 11   .... D... .... ....
0000 0010   00 04 06 01 A0 05 03 01   A1 44 01 01 11 01 00 55   .... .... .D.. ...U
0000 0020   01 00 A0 11 03 01 44 05   05 11 05 00 44 00 00 11   .... ..D. .... D...

-------------

my understanding so far:

(I've been using these vids, https://www.youtube.com/watch?v=tjZ2Mh_MV6g, https://www.youtube.com/watch?v=Xj5BqNHi1X8, and etc following vids, best vids I found so far on explaining this stuff. I know he's working with 8085, but he's explaining how the instructions and opcodes work...)

05 -> mov R0, 05

or... am I suppose to also use: 04, 01, A3 ???, as the vids talk about, something like:

mov 04's address (dest), 01's address (source), A3's address (next instruction) ???

what also is confusing me is the '44' instruction, as it's an XOR of reg1 and reg2, but I've not put anything into (haven't used) reg2... so I'm confused...

jaynabonne

"CPU contains 6 registers each 8 bits (R0 – R5 valued 0 – 5)

does this mean that the registers are initially set to having those values?"

To answer in easy order...

To be honest, I don't know why the registers would start with those particular values, but I can't work out a different interpretation of that statement that isn't redundant. I think in normal processors, the registers either start out 0 or have random values, depending on how the bits come up (apart from the status register, which needs to be reasonably defined). You wouldn't want to rely on a register having a particular value anyway. (I'm ignoring processors that dedicate specific registers to particular values.)

"Or, am I getting confused, and I just iterate through the mem array"

You just iterate through the mem array. You mentioned there's a terminate opcode, so you would stop there. And if you have jumps or branches or "go to subroutine" type instructions, then those would alter the instruction pointer, allowing you to jump around in the memory array. But beyond that, you just keep plodding forward.
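To make the "plod forward unless a jump changes the instruction pointer" idea concrete, here is a small Python sketch of such a loop. It handles only five of the opcodes from your table (LOAD, SUB, OUT, JNZ, HALT), and the little countdown program it runs is one I made up, not your machine.bin. It assumes addresses are two bytes, high byte first, and that the registers start at zero; both are assumptions on my part.

```python
LOAD, SUB, OUT, JNZ, HALT = 0x05, 0x22, 0xCC, 0xAA, 0xFF


def run(memory):
    regs = [0] * 6   # R0..R5; starting them at zero is an assumption
    ip = 0           # on reset, execution begins at location 0
    out = []         # values sent to "the screen"

    def fetch():
        """Read the byte at the instruction pointer and step past it."""
        nonlocal ip
        value = memory[ip]
        ip += 1
        return value

    def fetch_address():
        """Read a two-byte address, high byte first (big endian)."""
        high = fetch()
        low = fetch()
        return (high << 8) | low

    while True:
        opcode = fetch()
        if opcode == LOAD:
            r = fetch()
            addr = fetch_address()
            regs[r] = memory[addr]
        elif opcode == SUB:
            r1 = fetch()
            r2 = fetch()
            regs[r1] = (regs[r1] - regs[r2]) & 0xFF   # keep it 8 bits wide
        elif opcode == OUT:
            out.append(regs[fetch()])
        elif opcode == JNZ:
            r = fetch()
            addr = fetch_address()
            if regs[r] != 0:
                ip = addr          # the only place the IP moves "sideways"
        elif opcode == HALT:
            break
        else:
            raise ValueError("unknown opcode %02X" % opcode)
    return out


memory = bytearray(1024)            # 1K of RAM
program = bytes([
    0x05, 0x00, 0x00, 0x20,         # 0000: LOAD R0, $0020  (counter)
    0x05, 0x01, 0x00, 0x21,         # 0004: LOAD R1, $0021  (the constant 1)
    0xCC, 0x00,                     # 0008: OUT  R0
    0x22, 0x00, 0x01,               # 000A: SUB  R0, R1
    0xAA, 0x00, 0x00, 0x08,         # 000D: JNZ  R0, $0008
    0xFF,                           # 0011: HALT
])
memory[0:len(program)] = program
memory[0x20] = 3                    # data lives in the same memory as the code
memory[0x21] = 1

outputs = run(memory)
print(outputs)  # [3, 2, 1]
```

Note that the loop never goes "back to the start of the array" on its own; it only revisits offset 8 because JNZ put that value into the instruction pointer.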

And now about the address and other operands.

(Actually, I see you just asked another question, so I'll pick it up there.)

jaynabonne
The key is to know that the memory you're stepping through is not just the instructions but the data as well, and the amount of data bytes depends on the instruction. Let's do some. I assume that where it specifies a register in an instruction, that's a single byte index for the register, and an address is two bytes. Looking at the data, it looks like addresses are big endian (high byte first), since they need to be within your 1K block.

The first byte is 5. That's a load. The next byte is the register to load into, followed by the address (as two bytes) to load from. So you'd have:

05 (LOAD) 04 (register 4) 01 A3 (from address 0x01a3) => LOAD R4, $01a3

Next you have:

44 (XOR) 02 (R2) 02 (R2) => XOR R2, R2 (which zeros it)

05 (LOAD) 00 (R0) 01 A0 (from address 0x01a0) => LOAD R0, $01a0

And so forth. I can do more if you want, but maybe that helps?
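Here is a small Python sketch of that decoding, in case seeing it as code helps. It is a disassembler only (it prints what each instruction is without executing it). The operand layout per opcode comes from your instruction table; the names INSTRUCTIONS and disassemble are mine, and it assumes register operands are one byte and addresses are two bytes, high byte first.

```python
# opcode -> (mnemonic, operand kinds), taken from the posted instruction table
INSTRUCTIONS = {
    0x11: ("ADD", ("reg", "reg")),
    0x22: ("SUB", ("reg", "reg")),
    0x44: ("XOR", ("reg", "reg")),
    0x05: ("LOAD", ("reg", "addr")),
    0x55: ("LOADR", ("reg", "addr")),
    0x06: ("STORE", ("addr",)),
    0x66: ("STORR", ("reg", "addr")),
    0xCC: ("OUT", ("reg",)),
    0xAA: ("JNZ", ("reg", "addr")),
    0xFF: ("HALT", ()),
}


def disassemble(code):
    ip = 0
    lines = []
    while ip < len(code):
        name, kinds = INSTRUCTIONS[code[ip]]   # this byte is read as an opcode
        ip += 1
        operands = []
        for kind in kinds:
            if kind == "reg":
                operands.append("R%d" % code[ip])          # one byte
                ip += 1
            else:
                address = (code[ip] << 8) | code[ip + 1]   # two bytes, high first
                operands.append("$%04x" % address)
                ip += 2
        lines.append((name + " " + ", ".join(operands)).strip())
    return lines


first_bytes = bytes.fromhex("05 04 01 A3 44 02 02 05 00 01 A0")
listing = disassemble(first_bytes)
for line in listing:
    print(line)
# LOAD R4, $01a3
# XOR R2, R2
# LOAD R0, $01a0
```

The table drives how many bytes get consumed after each opcode, which is why the instruction length varies from one instruction to the next.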

HegemonKhan
I just edited my previous post (the one with the assignment requirements and info)... if you wouldn't mind looking over it again... to see what I understand/don't understand, and am confused with.

jaynabonne
It looks like you have the right idea. But addresses are 16 bits - two bytes. (See my post.) The XOR is actually the same register, which makes it 0.

HegemonKhan
ya, I just started reading your post with this in it:

"The first byte is 5. That's a load. The next byte is the register to load into, followed by the address (as two bytes) to load from. So you'd have:

05 (LOAD) 04 (register 4) 01 A3 (from address 0x01a3) => LOAD R4, $01a3"

I think I get it now!

so the "6 registers of a byte: R0-R5 with values 0-5", the values '0-5' is just to correlate to the data in the mem array:

data in mem array -> virtual register (variable)

00 -> R0
01 -> R1
02 -> R2
03 -> R3
04 -> R4
05 -> R5 // well, this actually can't be done, lol, as 05 is already an opcode... lol

jaynabonne
Great! :)

jaynabonne

"05 -> R5 // well, this actually can't be done, lol, as 05 is already an opcode... lol"

It depends! It depends on whether you encounter it at the time you expect an instruction or not. You have to keep in mind that you're marching through the array one byte at a time. You read the opcode and then increment the instruction pointer. If the opcode needs a register, then you read that byte, but at that point, you're not processing it as an instruction but as data for the instruction. Then you increment the IP and read the high byte of the address, increment again and read the low byte, and increment once more to prep you for reading the next instruction.

Data values have meaning in context only. If you're at the point of reading an opcode, it's treated as an opcode. If it's at the point of reading a register, it's treated as a register. You must parse it sequentially, in context. You can't just say every 5 is an opcode. It might be a register value or part of an address even.

HegemonKhan
ah, thanks for clearing that up as well!

-------------

thank you so much... hopefully now I can get this entire assignment done without any more difficulties, ... but probably something else still will come up... sighs.

I've never done assembly before (unlike learning the high level languages thanks to/from 3 years of learning quest), so unfortunately, I wasn't able to connect/infer/understand these questions on my own and hours of research (I tried... sighs), so I needed these "example walkthrough helps and explanations" you've been giving me in your posts. I think a lot of the class is struggling too, except the ones who probably already know assembly a bit (or they're just geniuses... or better at researching/online searching than I am). For someone totally new to assembly, these constructs of how to match the data with the instructions and my initial difficulty with understanding that (some of) the data is the opcodes and you're basically doing a Script dictionary, using the opcodes as the 'keys' and the goto looping/jumping based on checking if the data is an op code and what opcode. I just think we needed more of a walk-through like you've been doing for me, to get this stuff realized, as I wouldn't have ever been able to infer these constructs on my own.

---------

I really appreciate the help you've given me! I unfortunately don't have any network for getting help (aside from the prof). I haven't been able to get any of the other students to be study-buddies for/with me yet. I'm still trying to work on it, but they're not that interested, sighs.

You've been such a life saver (hopefully, I won't be needing to bug you too much more with this or future assignments), THANK YOU VERY MUCH!

jaynabonne
If you get a chance, look at what the assembly code *you* write generates in machine code, either via a disassembler or a debugger. That will help you to see, from a real processor, how things are structured as well.

HegemonKhan
extremely quick question, for the:

'LOADR' and 'STORR' instructions:

LOADR, 55h, reg1, address, load reg1 with value at (address+reg1)
STORR, 66h, reg1, address, write value in R0 to (address+reg1)

for the '+reg1' in '(address+reg1)' do I use the offset of the specified register (aka: 0-5), or do I use the value held in/at that specified register (whatever) ??? As in either case, I could be jumping forward to another segment of 1 (value data) or 2 bytes (address data), though so far in the 'machine.bin' the register offset has been 0 or 1 (meaning we're just using storing/loading the value from/into the msb/lsb of the address), I think.

that was probably confusing...

for pretend example, if I had:

RA = register array: R0, R1, R2, R3, R4, R5
RA[0] = 100h
mem_array[index_offset_var] = 00h ; 00h -> RA[00h->0]

would I do conceptually:

mov mem_array[index_offset_var + 0], RA[0]

or this conceptually:

mov mem_array[index_offset_var + 100], RA[0]

???

jaynabonne
The general answer is that you use the value. The offset is not really very useful.

--- Longer: examples ---

Let's try this example (which doesn't use R0 for everything, since it makes a difference):

55 01 23 45

This is a LOADR using R1 and address 0x2345. And let's say R1 has value 0x20. Then in this case, you'd have:

R1 <- [address + R1]
<- [0x2345 + 0x20]

So R1 would be loaded with whatever value is at memory location 0x2365.

Similarly:

66 01 23 45

is a STORR with the same arguments, and R0 is implicitly used as the source. Assuming R0 has value 0x42, then:

[address + R1] <- R0, or

[0x2345+0x20] <- 0x42

So you'd store the value of R0 (0x42) into the memory address 0x2365.
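The same two examples as a Python sketch, to show that it is the register's value (0x20), not its number (1), that gets added to the address. The memory here is sized to the full 16-bit space only so that address 0x2365 exists; your lab's RAM is 1K. Wrapping the sum to 16 bits is my assumption, and effective_address is a name I made up. I run the STORR first so that the LOADR has something visible to read back.

```python
memory = bytearray(0x10000)   # full 16-bit space, just for this example
regs = [0] * 6                # R0..R5


def effective_address(code, ip):
    """Decode '<opcode> <reg> <addr hi> <addr lo>' starting at ip.

    Returns (register index, address + value held in that register).
    """
    reg = code[ip + 1]
    base = (code[ip + 2] << 8) | code[ip + 3]     # big endian address
    return reg, (base + regs[reg]) & 0xFFFF       # 16-bit wrap is an assumption


regs[1] = 0x20    # value held in R1
regs[0] = 0x42    # value held in R0 (STORR always stores R0)

# STORR: 66 01 23 45  ->  [0x2345 + R1] <- R0
reg, ea = effective_address(bytes.fromhex("66 01 23 45"), 0)
memory[ea] = regs[0]
storr_address = ea

# LOADR: 55 01 23 45  ->  R1 <- [0x2345 + R1]
reg, ea = effective_address(bytes.fromhex("55 01 23 45"), 0)
regs[reg] = memory[ea]

print(hex(storr_address), hex(regs[1]))  # 0x2365 0x42
```

After the LOADR, R1 no longer holds 0x20, so a second LOADR with the same bytes would read from a different address.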

jaynabonne

"so far in the 'machine.bin' the register offset has been 0 or 1 (meaning we're just using storing/loading the value from/into the msb/lsb of the address)"

And I have no idea what this means, which scares me a bit. The register "offset", if you will, has nothing to do with the bits of any address. It just specifies which register to use, which is outside of any address calculations as far as the processor is concerned. (Your implementation might use it to index, but that's not a concern in the higher level, virtual processor model.)

HegemonKhan
err, I meant the data values (that are for being the instruction sets' regX operands) in the machine.bin file

so, just ignore this bit, lol.

--------

Thanks for answering about my index mode question.

(I was tired when I asked that question, I knew that answer... but my mind was tired and pre-occupied with everything else of the lab, lol. So, my mind wasn't focused on how the indirect address mode works. Getting sleep and not being tired, helps a lot)

jaynabonne
I know the feeling! :) I've made some really stupid mistakes when tired that I looked at the next day and thought, "What were you thinking?"

HegemonKhan
sighs, I'm just stupid, sighs. I just can't figure this stuff out on my own, sorry for asking for help again, but I need another walkthrough help with this stuff (I'm pretty much asking you to do the entire assignment, as we're just given this for us to figure out, which I just can't do my first time without having a demonstration, sighs. I'm just stupid, sighs. I can't learn this stuff on my own for my first time, I need to see how it works and is done, for me to start to understand this stuff, sighs). HK, is not the "smartest stupidest/dumbest person", HK is just the stupidest/dumbest person, sighs.

I'm really sorry for needing you for such a "crutch", but our prof just gave this assignment, and as can be seen, it's really difficult for me to figure and do all of this stuff, for my very first time, on my own without much help nor practice on how to do this assignment, and me being stupid, I can't figure it out like other (smart, unlike me) people can, sighs. All we do in class is a lecture on the instruction points (just more overwhelming information), wish we could be given practice labs and helped/guided through them (and more slowly too, as I'm new to the different numbering systems, it takes me a bit longer with hexadecimal and binary, compared obviously to decimal, lol)... so we can learn how to actually do assembly labs, sighs.

There's no tutoring that I'm aware of for assembly at the school/college I go to, none of my colleagues are interested in helping, I can't find any videos online and I really have a hard time learning from online videos anyways, and our prof works, so we really only get him for the single day of the week that is our class (I really hate programming classes which meet only once a week). I never asked a teacher for help tutoring before (assuming we could even be able to schedule for a tutoring session), so it would be really socially awkward for me, ..."hey prof can you tutor me, so I don't flunk your class?" I don't care about grades at this point, but I'd like to learn assembly (It'd be nice if I could get that costly education from the school that I paid for... my money your service of education, a very simple contract/agreement, gotta love the U.S. school system, no wonder we're like 38 or worse in world)... can't become a programmer if I can't even program in assembly, sighs. I got no network of friends or other people I know to get help/tutoring from, except you Jay.

long sob story short, I can't thank you enough Jay! Is there any way I could pay you for your help? Some kind of paypal or something maybe? I don't know how online transactions work very well, as I believe you're still living in Europe, and I in CA, U.S.

------------

our given CPU information:

cpu contains 6 registers, each 8 bits (R0-R5 valued 0-5)
16-bit address space
1k of RAM
on reset the cpu begins execution at location 0
cisc style instruction set, instruction length varies with instruction
register operands are 1 byte
all memory addresses are 2 bytes long and are stored in big endian format

our given instruction set:

mnemonic  opcode (hex)  operand 1  operand 2  notes
ADD       11h           reg1       reg2       reg1 = reg1 + reg2
SUB       22h           reg1       reg2       reg1 = reg1 - reg2
XOR       44h           reg1       reg2       reg1 = reg1 XOR reg2
LOAD      05h           reg1       address    load reg1 with value at address
LOADR     55h           reg1       address    load reg1 with value at (address+reg1)
STORE     06h           address    ---        write value in R0 to address
STORR     66h           reg1       address    write value in R0 to (address+reg1)
OUT       0CCh          reg1       ---        send value in reg1 to output (screen)
JNZ       0AAh          reg1       address    if value in reg1 isn't zero then next instruction is at address
HALT      0FFh          ---        ---        CPU halts - terminate program

our given 'machine.bin' file (only the first 48 bytes, offsets 00h-2Fh):

offset      0  1  2  3  4  5  6  7    8  9  A  B  C  D  E  F    ASCII
0000 0000   05 04 01 A3 44 02 02 05   00 01 A0 05 04 01 A3 11   .... D... .... ....
0000 0010   00 04 06 01 A0 05 03 01   A1 44 01 01 11 01 00 55   .... .... .D.. ...U
0000 0020   01 00 A0 11 03 01 44 05   05 11 05 00 44 00 00 11   .... ..D. .... D...

------------------------------------

endian-ness:

the 'machine.bin' is in big-endian, I think...

the 'machine.bin' is stored in a 'virtual mem' array variable (named: program_buffer) on the intel masm32's DS (data segment) ;does it keep the same endian-ness as the 'machine.bin' or does it convert its endian-ness?

the 'virtual registers' are stored in an array variable (named: RA) on the intel masm32's DS (data segment)

intel masm32 uses little-endian

all memory addresses are 2 bytes long and are stored in big endian format

register operands are 1 byte

16-bit address space

could you explain what I need to do with the endian-ness?

(I have some guesses, but it'd be a waste of time and space for me to try to explain/write out my guesses and thoughts here)

(I'm aware/believe that I can use 'bswap' for dwords:32bits to convert between endian-ness)

(I'm aware/believe that I can use 'xchg' for the high:8bits and low:8bits of words:16bits:example ax: xchg ah,al)

(let me know if I'm incorrect about 'xchg' and/or 'bswap', or if you know of any other instructions that would work, or if I have to manually 'function' it in)

-----------------------------------------------

(I'm not sure about the 8085 you worked with, but the general purpose registers in 80386+ aren't as role-restricted as they originally were in the first cpus)

so, our first instruction:

05 04 01 A3
LOAD R4, value at address*

*does this mean just '01 h' (1 d) or '01A3 h' (419 d/t) or '0xA301 h' (41729 d/t) ???

*also, I believe that a byte (8 bits) can hold '0-255' unsigned (2^8 values) and '-128 to +127' signed, whereas a word (16 bits) can hold '0-65535' unsigned (2^16 values) and '-32768 to +32767' signed.
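A quick way to double-check those ranges, sketched in Python: n bits give 2^n distinct bit patterns, so the largest unsigned value is 2^n - 1, and two's complement signed values run from -2^(n-1) to 2^(n-1) - 1. The helper names unsigned_range and signed_range are made up for this example.

```python
def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer with this many bits."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Smallest and largest value of a two's complement integer with this many bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(unsigned_range(8), signed_range(8))     # (0, 255) (-128, 127)
print(unsigned_range(16), signed_range(16))   # (0, 65535) (-32768, 32767)

# The same byte, A3h, read both ways
raw = bytes([0xA3])
as_unsigned = int.from_bytes(raw, "big", signed=False)
as_signed = int.from_bytes(raw, "big", signed=True)
print(as_unsigned, as_signed)                 # 163 -93
```

The bits in memory are identical in both cases; signed versus unsigned is only a matter of how the program chooses to interpret them.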

this is the initial algorithm I came up with (only the parts for the very first operation):

Pre_Start:

mov ebx, 01h ; starting at 'index 1' was wrong (I noticed the machine.bin seemed to start at index 1), so I changed to correct index of 0: xor ebx, ebx

Start:

cmp program_buffer, LOAD ; LOAD EQU 05h
je Load

Load:
add ebx, 01h ;this moves from '05' to '04'
mov eax, program_buffer[ebx] ;moves the value (0-5) into eax, which will be used as RA's index: RA[eax]
add ebx, 01h ;this moves from '04' to '01'

mov edx, word ptr program_buffer[ebx]
;Is this correct to be using a word value (01A3 or A301), as (I think I'm) suppose to put this into a byte... ???
;if I'm only working with a byte (01 h or 0xA3 h), than it'd be easy to use: mov (dl or dh), xxx
;if I'm working with a word (01A3 h or 0xA301 h), than I can either use: (movzx edx or mov dx), xxx

mov RA[eax], edx ; this is where my main troubles are ...

add ebx, 02h ; this is for jumping past the word address (01 h, 0xA3 h) to the next opcode: '44 h' (an XOR operation)

; I'm having enough trouble already, I haven't even started to learning of being aware of the flags' settings/clearings yet..., so I'm cheating below lol
xor eax, eax
cmp eax, 00h
je Start

-----------------------

P.S.

I can't even get the VS' linking to work with the 'win32api.asm' (prototype/header) file we're to use for the 'invoke' commands (it does all the pushes/pops for us), so after wasting huge hours to no avail of trying to do this, I luckily had the thought to just copy and paste its contents directly into my program, as it seemed just like a header/prototype file as with C++, and thankfully it worked... now I can finally debug... but I've got no experience with assembly and debugging, hence why I'm stuck and am asking if you can walk me through getting the algorithm/logic/syntax right for an instruction (the first instruction), as hopefully upon seeing how one is done, I can do/figure out the rest.

jaynabonne
First of all, you're not stupid! You actually have a good start here. If anything, I think it's not being explained properly. (And no, forget compensation. I'm happy to see you get to the end of this. :) )

so, our first instruction:

05 04 01 A3
LOAD R4, value at address*

*does this mean just '01 h' (1 d) or '01A3 h' (419 d/t) or '0xA301 h' (41729 d/t) ???


As addresses are 16 bits, it would not be just 01 (that's only 8 bits). So it would definitely be the next two bytes to make an address. I had assumed initially that it was big-endian (as the address 0xa301 would be beyond your 1K "memory"), and you've just said it is, so that's confirmation.

So first, yes, the starting index is 0. :)

One thing that might help is to always increment bx after reading a value. That way, you'll always be prepped to read the next value in the stream.

mov al, program_buffer[bx]
inc bx
cmp al, LOAD
je load


load:
mov cl, program_buffer[bx] ; grab register value
inc bx
xor ch,ch ; clear high byte, so you have a 16-bit offset to use : (0:cl)

Now, as the x86 is little endian (low byte stored first) and this is big endian (high byte stored first), you can't just read the 16-bit value as is (it will be swapped). You can either:
1) read it as a word and then swap it. You can't use bswap, as that appears to only work for 32-bit registers. So I think your xchg method would work. Just be sure to read the value into a register with high and low parts. :)

mov ax, word ptr program_buffer[bx] ; program_buffer is declared as bytes, so say "word ptr" to read 16 bits
add bx, 2
xchg ah, al

2) read each byte separately:
mov ah, program_buffer[bx] ; high byte
inc bx
mov al, program_buffer[bx] ; low byte
inc bx

When you go to load, it will be a byte value. So you'd load it into a byte register (e.g. dh or dl, not dx)
mov dl, program_buffer[ax] ; ax here,as that's the address from the instruction, not the next address in the program

And then you should be able to store:

mov RA[cx], dl

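There may be syntax problems with this, and when I programmed the x86, you couldn't do arbitrary things with registers. In 16-bit addressing only bx, bp, si and di can go inside the brackets, so [ax] and [cx] above won't assemble as written, and a 32-bit flat program will probably want 32-bit registers there anyway (ebx instead of bx, and so on). So if there are problems, get things into appropriate registers first (e.g. zero-extend the value into a 32-bit register, or move it into si or di, and index with that). See how it goes.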
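
To make the two options concrete, here is a rough Python model of reading the big-endian address out of the instruction stream. It only mirrors the logic, not the MASM syntax; the function names are invented for the sketch, and the sample bytes are the LOAD example from above (05 04 01 A3).

```python
# Sketch: two ways to read a big-endian 16-bit address from a byte stream.
program_buffer = bytes([0x05, 0x04, 0x01, 0xA3])  # LOAD example from the thread

def read_addr_swap(buf, i):
    # Option 1: read the word the way x86 would (low byte first),
    # then swap the two halves (the xchg ah, al step).
    word = buf[i] | (buf[i + 1] << 8)
    return ((word & 0xFF) << 8) | (word >> 8)

def read_addr_bytes(buf, i):
    # Option 2: read the high byte, then the low byte, and combine them.
    high = buf[i]
    low = buf[i + 1]
    return (high << 8) | low

print(hex(read_addr_swap(program_buffer, 2)))   # 0x1a3
print(hex(read_addr_bytes(program_buffer, 2)))  # 0x1a3
```

Both give 0x01A3 (419), which fits inside the 1K buffer.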

I think you have the right idea. Just go step by step and keep the right sizes for your registers. (And if you don't like all the cmp's for the various instructions, we can discuss a "jump table", where you put the addresses of all the functions in to a memory array and then just jump into it based on the index - but that's a bit more advanced. Think of it as a script dictionary. EDIT: Actually, ignore this. I keep forgetting that you don't have that many opcodes. With the amount you have, the if/then cmp sort of thing should be fine. Otherwise, you'd have a largely empty table.)

I'm at work now, so I might not be able to respond very quickly. (I'm being a bit bad at the moment, to tell you the truth, doing this, but what the hell...) So if you respond and I take a while, keep that in mind.

HegemonKhan
ya, I saw the table in my own endeavors in trying to learn more of assembly, but I'll learn it at a later time, as indeed our conditions (instructions in instruction set) are so few, so the multiple cmps suffice for now. I just need to get down these basics/fundamentals of how programming is done in assembly, then I can learn all of the different and/or exotic features/instructions/etc that exist for use later on. I'm already too overwhelmed with stuff, which makes it hard for me. I can't focus on or figure out what's important when there's too much information given to me (just like when I first started with learning quest, I was so confused by all of the terms, data types, element/object types, and etc).

----------------

thanks for pointing out my redundancy with the index increment (I had it in every instruction label-jump, lol, wasn't paying attention to that), which I can just put once into the start loop.

Edit: actually, this doesn't work easily for my design... meh

------------------

ah, so the issue preventing me from putting the address' word value into the byte RA (register array) wasn't the two byte values of the address, but just their endian-ness (causing their value to be beyond a byte's max value range). I was confused how I was to put what I thought was the address' word value (01A3/A301) into my byte register... (or if I was just to place the high/low into the byte register, ignoring the other high/low byte).

is this correct?

I can do this? for example:

mov RA[0], dl/dh

but.. which do I use? the low or high? don't I need both low and high, as isn't that the value I want to store? I'm still a bit confused with this...

the dl/dh isn't seen as a word into the RA[0] byte, due to merely using the correct (doing the conversion) endian-ness ???

----------

it's just really awful/depressing/unmotivating/downer/despairing when you've tried so hard (getting huge headaches from all the thinking) and over long hours to try to figure out (in this case, programming) and you just utterly fail. I am stupid, I shouldn't be utterly failing with working so hard. The worse thing in the world: "working like a madman and not getting anything done"; that's the definition of failure/loser, of being totally stupid/pathetic, sighs. Ignore my sobbing/self-pity, as I'm just really hard on myself, I expect/demand high quality of myself, so it's very disappointing when I fail so badly at things, sighs. I just take failure really hard, especially when I try so hard too.

--------

Anyways, you're the greatest, Jay!, thank you soooooooooooo much for your help! A total life saver with this assembly learning! I'm really learning and understanding it so well now, thanks to you!

jaynabonne
The endian-ness only comes into play for values with multiple bytes (so 16-bit, 32-bit, 64-bit values, etc). It has to do with the order of the bytes within the larger element. So a 16-bit (2 byte) value can be stored either [low 8 bits, high 8 bits] or [high 8 bits, low 8 bits]. A single byte is just stored, so there is no issue there.

You'll see endian-ness as an issue with addresses, since they're stored high-byte first in your virtual processor but low-byte first in the native x86 code. If you read it as a 16-bit value, the two bytes will be swapped. You either need to swap them or read them a byte at a time into the right spot. (That may be a bit redundant with my previous answer, but just in case that wasn't clear.)
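
If it helps to see the byte order written out, this Python sketch packs the address 419 (0x01A3) both ways using the standard struct module. It is only an illustration of the layout, nothing you need for the assignment.

```python
import struct

value = 419  # 0x01A3, the address from the LOAD example

little = struct.pack("<H", value)  # x86 order: low byte first
big = struct.pack(">H", value)     # virtual machine order: high byte first

print(little.hex(), big.hex())     # a301 01a3

# Reading big-endian bytes as if they were little-endian gives the swapped value.
misread = struct.unpack("<H", big)[0]
print(hex(misread))                # 0xa301

# A single byte has only one possible layout, so there is nothing to swap.
assert struct.pack("<B", 0xA3) == struct.pack(">B", 0xA3)
```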

Endian-ness should not have an effect when working with the registers, as the register index (the value in the instruction) is only a byte, and the registers themselves are only a byte. You do need to extend the 8-bit register index into a 16-bit value to use it as an index. I don't think you can use an 8-bit register as an index.

I can do this? for example:

mov RA[0], dl/dh

but.. which do I use? the low or high? don't I need both low and high, as isn't that the value I want to store? I'm still a bit confused with this...


dl and dh are two 8-bit halves of the 16-bit dx. You can treat them as two individual 8-bit registers or together as a single 16-bit register. In this case, you're only working with an 8-bit data value (the single byte you load from memory and want to put in the "register"). So the answer to your question of which of dl/dh to use is "whichever one you loaded the value into to begin with". Just pick one and use it. Either is fine. You don't need both high and low, as that would imply a 16-bit value; but you only have an 8-bit one. You can use any 8-bit register for your transfer.
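
As a rough picture of how dx relates to dh and dl, here is a Python sketch with made-up helper names. A byte value only ever occupies one half; the other half is simply not part of it.

```python
def split_dx(dx):
    # dh is the upper 8 bits of dx, dl the lower 8 bits
    dh = (dx >> 8) & 0xFF
    dl = dx & 0xFF
    return dh, dl

def join_dx(dh, dl):
    # putting the halves back together gives the 16-bit register
    return (dh << 8) | dl

dh, dl = split_dx(0x01A3)
print(hex(dh), hex(dl))      # 0x1 0xa3
print(hex(join_dx(dh, dl)))  # 0x1a3
```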

HegemonKhan
this is what I have so far:

Pre_start:

xor ebx, ebx ;this is what I use for the incrementing index

mov ecx, LENGTHOF program_buffer
;I just read that the 'loop' has a limited range, so to be safe, am using manual decrement ecx checking-jump, though hopefully I don't have the same issue ;with the jumping (near vs far), as I've not tried to understand them, lol.

Start:

sub ecx, 01h
;used for the looping, do I need to use 'sbb' (subtract with borrow/carry?), as I don't understand carry arithmetic or bit carrying in general

cmp program_buffer[ebx], LOAD ;LOAD EQU 05h
je Load

cmp program_buffer[ebx], STORE ;STORE EQU 06h
je Store

cmp program_buffer[ebx], ADD_ENUM ;ADD_ENUM EQU 11h
;ADD (and others) got color coded... so I'm using 'ADD_ENUM' just to be safe, as I don't want to over-write an instruction set operation, lol
je Add_Enum

cmp program_buffer[ebx], SUB_ENUM ;SUB_ENUM EQU 22h
je Sub_Enum

cmp program_buffer[ebx], XOR_ENUM ;XOR_ENUM EQU 44h
je Xor_Enum

cmp program_buffer[ebx], LOAD_R ;LOAD_R EQU 55h
je Load_R

cmp program_buffer[ebx], STORE_R ;STORE_R EQU 66h
je Store_R

cmp program_buffer[ebx], JNZ_ENUM ;JNZ_ENUM EQU 0AAh
je Jnz_Enum

cmp program_buffer[ebx], OUT_ENUM ;OUT_ENUM EQU 0CCh
je Out_Enum

cmp program_buffer[ebx], HALT ;HALT EQU 0FFh
je Halt

cmp ecx, 00h ;or it's: 01h
;need to figure it out (confused where/when ecx is needed to be checked at: 0 or 1), easy to test it, when I get to it...
je Finish
add ebx, 01h ;just in case a value doesn't match up with any of the compares, to advance it to the next value (next address)
jne Start

Load:

add ebx, 01h ;moves index from opcode address/value (05) to register address/value (04)
movzx eax, program_buffer[ebx] ;the 'movzx' fills in zeroes for the rest (unused) of the bit slots ;storing the register index into eax
add ebx, 01h ;moves index from register address/value (04) to the left byte of the address' address/value (01 of 01,A3)

movzx edx, word ptr program_buffer[ebx]
;the 'type:word ptr' gets (in this case) 01A3 for us and puts it into edx (and movzx zeroes the remaining unused bits), which I think you do differently in your 8085/8086 code

xchg dh, dl ;this should switch the high and low bytes (big endian to little endian), so now dx should be in little endian

mov RA[eax], ??? ;this is where I don't understand, as which am I to use, dl or dh, but don't we need to use both bytes (dx: 01A3/A301 whatever it is), which would be the value (419) ???

add ebx, 02h ; to skip past the word address indexes to the next opcode value (44:XOR)

;to cheat/ensure that I jump back to Start, lol:
xor eax,eax
cmp eax, 00h
je Start

Store:

(etc etc etc operations/labels, once I get few parts of the load algorithm understood, I should be able to get all of the other operations correct)


-------------------

Edit:

ah, so because its '419' it doesn't use the high byte... do I need to zero out that high byte? as I couldn't quite follow some of your code... (edit2: I understand that for the mov/transfer I can use either dl or dh, now, but am unsure if I need to zero out the high byte?)

and ya, I still have trouble separating/understanding/confused by address length/size and/vs their value length/size, as can be seen by my questions

jaynabonne

ah, so because its '419' it doesn't use the high byte



The key to your misunderstanding is in your question: is 419 actually an 8-bit value? (Hint: A single byte is in the range 0-255 (0-0xFF)).

More importantly though, regardless of the value, it's stored in the stream as two bytes in the instruction (as it has to be an address). In this case, it's stored as 01 A3. That's two bytes. Even if it was 00 00, it would still be two bytes. You get the high and low for the address.

When you read the value at that 16-bit location, what it *points* to is a byte.

Data values are 8 bits.
Addresses are 16 bits.
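
Put together, a LOAD can be modelled like this in Python. This is a sketch of the idea only: the names are invented, the data byte at 0x01A3 is made up for the demo, and the instruction layout (opcode, register, two address bytes) is the one from the 05 04 01 A3 example.

```python
MAX_RAM = 1024
memory = bytearray(MAX_RAM)
memory[0:4] = bytes([0x05, 0x04, 0x01, 0xA3])  # LOAD R4, [0x01A3]
memory[0x01A3] = 0x2A                          # made-up data byte for the demo
registers = bytearray(6)                       # R0-R5, one byte each

def do_load(mem, regs, ip):
    reg = mem[ip + 1]                        # register index: one byte
    addr = (mem[ip + 2] << 8) | mem[ip + 3]  # address: two bytes, high byte first
    regs[reg] = mem[addr]                    # the data at that address: one byte
    return ip + 4                            # offset of the next instruction

next_ip = do_load(memory, registers, 0)
print(registers[4], next_ip)                 # 42 4
```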

HegemonKhan
Well, the program "runs..." somewhat... I tried to debug through it (took a long long time, lots of steps, lol)... I ended up with '5' being the value used for the WriteConsoleA (display/output in/for the cmd prompt box), except it outputs a spade (like in playing cards)... also... it seemed to have failed when it got to, as it went past the FF... seemingly stuck forever/infinite loop (not sure why/how my program was able to output and seemingly not crash... or maybe it did use up all of the resources and just didn't give a crashing error message, meh):

what I'm guessing is supposed to be the last sequence of the program:

AA 05 00 07 FF

(machine.bin's indexes of: 0000 0090: 05 01 A2 11 02 04 22 05 02 AA 05 00 07 FF 00 00)

for reference of why I think the program ends above, aside from the FF (terminate), there's also the below:

(machine.bin's indexes of: 0000 00A0: 63 06 37 A6 16 84 CC 71 E5 5A CD 0B 0C 0D 0E 0F)

-----------

I've no idea how to debug/troubleshoot whatever my issues are... it'd take too long for me to try to map it out by hand... and I wouldn't even know if I could even do it accurately... I never done debugging before, so I don't have any idea on how to use it purposefully... I watched the intel registers change, but couldn't figure out how to show each of the values in the virtual register array, and etc stuff that I'm clueless about doing/using.

I guess I could post all of my operations and/or full code, and maybe you could spot some mistakes that way... at this point I don't care too much, I've already come so vastly far from where I started at, "huuugggeee" (if I can use this saying by Presidential Candidate Donald Trump, it's becoming really populous now, lol, like how that stupid bud-weis-er frog advertising from long ago or the stupid 'wazzup' whatever beer advertising long ago) thanks to you Jay!

jaynabonne
That last sequence is

jnz R5, 0007 (jmp back to offset 7 if r5 is not zero)
halt

What do you do for a halt? I assume you have to break out of your loop somehow (the one I imagine you have).

Also, feel free to post, either here or to my email, if you wish.

HegemonKhan
well, I'm sure there's a ton of errors, not just at the end, as I probably got some bad algorithms for my instructions/operations and/or code blunders elsewhere. I'm still a little confused with the endian-ness for the other operations, especially whether the converting needs to be done or not, and/or how to do it in reverse (little endian to big endian). I tried based off of you helping me with the 'load' operation, though obviously got stuff wrong.

anyways, here's my program... hopefully the format will be preserved...

(sorry, I haven't got around yet to doing the commenting, lol)

(also, I had to paste the header/prototype file into my program, so it takes up a lot of my program, scroll down past it to the rest of my program's code)

(I'm sure I needed to use the arithmetic carry/borrow commands, add:adc/sub:sbb, but I don't understand how to work with carrying/borrowing digits yet)

(also, maybe I needed to use 'lea' too for some address-getting... hmm)

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

;redacted

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; This program's purpose is to emulate a CPU

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************
ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD



CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD


TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************
NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************
STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************
ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************
FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

MAX_RAM EQU 1024
INVALID_HANDLE_VALUE EQU -1
ERROR_ENUM EQU 1
READ_FILE_ERROR EQU 0
NULL_PTR EQU 0

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

ADD_ENUM EQU 11h
SUB_ENUM EQU 22h
XOR_ENUM EQU 44h
LOAD EQU 05h
LOAD_R EQU 55h
STORE EQU 06h
STORE_R EQU 66h
OUT_ENUM EQU 0CCh
JNZ_ENUM EQU 0AAh
HALT EQU 0FFh

;***********
; VARIABLES
;***********

heading byte "redacted", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to emulate a CPU", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

error_file_open byte "ERROR: Unable to open input file", \
CARRIAGE_RETURN, NEW_LINE_FEED

file_name byte "c:\redacted\machine.bin", 0

program_buffer byte MAX_RAM dup (0)

return_code dword 0

bytes_written dword 0
bytes_read dword 0
file_handle dword 0
file_size dword 0
handle_std_out dword 0
handle_std_in dword 0

RA byte 6 dup (0) ; RA = register array (R0-R5)

output_value dword ?

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

;*******************************
; Get handle to standard output
;*******************************

invoke GetStdHandle, STD_OUTPUT_HANDLE
mov handle_std_out, eax

;******************************
; Get handle to standard input
;******************************

invoke GetStdHandle, STD_INPUT_HANDLE
mov handle_std_in, eax

;********************************
; Open existing file for reading
;********************************

invoke CreateFileA, offset file_name, GENERIC_READ, FILE_SHARE_NONE, \
NULL_PTR, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL_PTR

cmp eax, INVALID_HANDLE_VALUE
je Open_Error
mov file_handle, eax

;*******************************************
; Determine the size of the file (in bytes)
;*******************************************

invoke GetFileSize, file_handle, NULL_PTR
mov file_size, eax

;****************************************
; Read the entire file into emulator RAM
;****************************************

invoke ReadFile, file_handle, offset program_buffer, file_size, \
offset bytes_read, NULL_PTR

cmp eax, READ_FILE_ERROR
je Finish

;****************
; Close the file
;****************

invoke CloseHandle, file_handle

Pre_Start:

invoke WriteConsoleA, handle_std_out, OFFSET heading, SIZEOF heading, \
OFFSET bytes_written, NULL_PTR

invoke WriteConsoleA, handle_std_out, OFFSET history, SIZEOF history, \
OFFSET bytes_written, NULL_PTR

invoke WriteConsoleA, handle_std_out, OFFSET purpose, SIZEOF purpose, \
OFFSET bytes_written, NULL_PTR

xor ebx, ebx

mov ecx, LENGTHOF program_buffer

Start:

sub ecx, 01h

cmp program_buffer[ebx], LOAD
je Load

cmp program_buffer[ebx], STORE
je Store

cmp program_buffer[ebx], ADD_ENUM
je Add_Enum

cmp program_buffer[ebx], SUB_ENUM
je Sub_Enum

cmp program_buffer[ebx], XOR_ENUM
je Xor_Enum

cmp program_buffer[ebx], LOAD_R
je Load_R

cmp program_buffer[ebx], STORE_R
je Store_R

cmp program_buffer[ebx], JNZ_ENUM
je Jnz_Enum

cmp program_buffer[ebx], OUT_ENUM
je Out_Enum

cmp program_buffer[ebx], HALT
je Finish

cmp ecx, 00h ; or its 01h
je Finish
add ebx, 01h
jne Start

Load:

add ebx, 01h
movzx eax, program_buffer[ebx]
add ebx, 01h
movzx edx, word ptr program_buffer[ebx]
xchg dh, dl
mov RA[eax], dl
add ebx, 02h

xor eax, eax
cmp eax, 00h
je Start

Store:

add ebx, 01h
movzx ax, RA[0]
xchg ah, al
mov word ptr program_buffer[ebx], ax
add ebx, 02h

xor eax, eax
cmp eax, 00h
je Start

Add_Enum:

add ebx, 01h
movzx edi, program_buffer[ebx]
movzx eax, RA[edi]
add ebx, 01h
movzx esi, program_buffer[ebx]
movzx edx, RA[esi]
add eax, edx
mov RA[edi], al
add ebx, 01h

xor eax, eax
cmp eax, 00h
je Start

Sub_Enum:

add ebx, 01h
movzx edi, program_buffer[ebx]
movzx eax, RA[edi]
add ebx, 01h
movzx esi, program_buffer[ebx]
movzx edx, RA[esi]
sub eax, edx
mov RA[edi], al
add ebx, 01h

xor eax, eax
cmp eax, 00h
je Start

Xor_Enum:

add ebx, 01h
movzx eax, program_buffer[ebx]
mov edi, eax
add ebx, 01h
movzx edx, program_buffer[ebx]
xor eax, edx
mov RA[edi], al
add ebx, 01h

xor eax, eax
cmp eax, 00h
je Start

Load_R:

add ebx, 01h
movzx eax, program_buffer[ebx]
movzx edi, RA[eax]
add ebx, 01h
movzx edx, program_buffer[ebx+edi]
xchg dh, dl
mov RA[eax], dl
add ebx, 02h

xor eax, eax
cmp eax, 00h
je Start

Store_R:

add ebx, 01h
movzx edx, RA[0]
xchg dh, dl
movzx eax, program_buffer[ebx]
movzx edi, RA[eax]
add ebx, 01h
mov word ptr program_buffer[ebx+edi], dx
add ebx, 02h

xor eax, eax
cmp eax, 00h
je Start

Jnz_Enum:

add ebx, 01h
movzx eax, program_buffer[ebx]
cmp eax, 00h
add ebx, 01h
jne Start
add ebx, 02h

xor eax, eax
cmp eax, 00h
je Start

Out_Enum:

add ebx, 01h
movzx eax, program_buffer[ebx]
movzx edx, RA[eax]

mov output_value, edx

invoke WriteConsoleA, handle_std_out, OFFSET output_value, \
SIZEOF output_value, OFFSET bytes_written, NULL_PTR

add ebx, 01h

xor eax, eax
cmp eax, 00h
je Start

Open_Error:

invoke WriteConsoleA, handle_std_out, offset error_file_open, \
sizeof error_file_open, offset bytes_written, NULL_PTR

mov return_code, ERROR_ENUM

Finish:

invoke ExitProcess, return_code

Main endp

END Main


and here's the machine.bin data file:

(err, that didn't work... let me add it as attachment... nevermind... can't do a bin file)

jaynabonne
As I said - soo close. :) Just some minor things here and there, which you should be able to straighten out.

First off, it looks like you're using ECX to count how many instructions you've executed, with an eye toward exiting when that reaches zero. That's not right for a couple of reasons. First, you set it to the number of bytes in your program buffer and decrement it once per instruction, but you only check it if you don't know what an instruction is. So even if you wanted to do that, that's not the way. (If you want to make sure you don't walk off the end of your program buffer, then when you go to read your next opcode, see if ebx >= 1024 and quit if it is.) But that's really more of a tangent, though, because you don't want to count at all! Your program just needs to run, forever, until it hits a HALT. The program may loop 65000 times or 65000*65000 times or who knows how much. Your code just needs to keep executing until it is told to stop. Imagine the running program were to put up a prompt on screen and take input from the user. It might sit and loop for a long time waiting for a keystroke. And it would have to keep running until the user makes it exit. (That's a larger example than what you can probably do with the instructions you have, but I hope that makes sense).
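
Here is the shape of that loop as a Python sketch: no instruction counter, just fetch and dispatch until HALT, with a bounds check on the instruction pointer. The handler table and the SKIP opcode are hypothetical stand-ins so the example runs on its own; they are not part of your instruction set.

```python
HALT = 0xFF
MAX_RAM = 1024

def run(mem, handlers):
    """Fetch and dispatch until HALT; return how many instructions ran."""
    ip = 0
    executed = 0
    while True:
        if ip >= MAX_RAM:
            raise RuntimeError("instruction pointer ran off the end of memory")
        opcode = mem[ip]
        if opcode == HALT:
            return executed
        if opcode not in handlers:
            raise ValueError(f"unknown opcode {opcode:#04x} at offset {ip}")
        # each handler executes one instruction and returns the next ip
        ip = handlers[opcode](mem, ip)
        executed += 1

SKIP = 0x01  # hypothetical one-byte opcode, only for this demo
demo = bytearray(MAX_RAM)
demo[0:3] = bytes([SKIP, SKIP, HALT])
print(run(demo, {SKIP: lambda mem, ip: ip + 1}))  # 2
```

The count returned here is only for the demo; nothing in the loop depends on it.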

Now to the individual command implementations:

Load:
I understand now why you were asking about the high/low part of DX. The code you have is this (with comments by me):

add ebx, 01h ; increment past the opcode
movzx eax, program_buffer[ebx] ; read the next byte to get the register
add ebx, 01h ; increment past the register byte
movzx edx, word ptr program_buffer[ebx] ; load the *address* of the byte to read from memory.
xchg dh, dl ; swap the bytes of the address, so you can use them as a 16-bit number.

Here's where it goes wrong. Right now, in DX you have the *address* of the byte you want to load. You need to actually go and get the byte from memory at that address, just as you have for the other bytes above.
mov dl, program_buffer[edx] ; <- add this line, which goes out to memory at the address in edx and gets the byte there
mov RA[eax], dl ; store the read byte in the register.


This code is just bizarre to me:

xor eax, eax
cmp eax, 00h
je Start

First of all, doing an xor eax, eax sets the Z/E flag anyway, so you don't even need the compare. More than that, though, unless you actually need eax to be 0 at the start of the loop, you can just replace it with:

jmp Start


Store

The code you have is:


; increment past the opcode - rather than do this in every instruction, you could just do it before you start branching.
add ebx, 01h
; grab the value in R0
movzx ax, RA[0]
; you don't need to do this. The value you care about is in AL (an 8-bit value). Swapping just confuses things.
xchg ah, al
; here you actually overwrite your own code in the program buffer with 16 bits of data
mov word ptr program_buffer[ebx], ax
; then you skip past the damage. You've actually stored a 16-bit version of R0 over the address you were meant to use.
add ebx, 02h

What you want to do is:


; increment past the opcode - rather than do this in every instruction, you could just do it before you start branching.
add ebx, 01h
; grab the value in R0 . Not sure why you don't just use mov al, RA[0], but maybe there's a reason.
movzx ax, RA[0]

;; Grab the address where you need to store the byte value
movzx edx, word ptr program_buffer[ebx]
;; swap it so you can use it
xchg dh, dl
add ebx, 02h
;; store the byte into the desired address
mov program_buffer[edx], al
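
The same Store logic as a Python sketch, in case the flow is easier to follow there. It assumes, as your code does, that STORE is the opcode followed by a two-byte big-endian address and that it always stores R0; the function name and demo values are made up.

```python
def do_store(mem, regs, ip):
    addr = (mem[ip + 1] << 8) | mem[ip + 2]  # address from the instruction, high byte first
    mem[addr] = regs[0]                      # one byte goes to that address
    return ip + 3                            # next instruction; the program bytes are untouched

memory = bytearray(1024)
memory[0:3] = bytes([0x06, 0x01, 0xA3])      # STORE [0x01A3]
registers = bytearray(6)
registers[0] = 0x7F                          # demo value in R0

next_ip = do_store(memory, registers, 0)
print(hex(memory[0x01A3]), next_ip)          # 0x7f 3
```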


Add, Sub, Xor
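As far as I can tell, Add and Sub are ok, apart from the things noted before (the xor stuff at the end, using "inc ebx" instead of "add ebx, 1", etc). Xor is worth a second look, though: assuming it takes two registers like Add and Sub, it currently xors the two register indices read from the program rather than the values in RA[...], so it needs the same RA lookups the other two have.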
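
For the three arithmetic ops the pattern is the same: look up both register values, combine them, and keep only the low 8 bits of the result (which is what storing al back into RA does). A Python sketch, assuming each instruction is opcode, destination register, source register; the function name and the string op names are just for illustration.

```python
def alu(op, regs, dst, src):
    a, b = regs[dst], regs[src]  # register values, not the indices
    if op == "add":
        result = a + b
    elif op == "sub":
        result = a - b
    elif op == "xor":
        result = a ^ b
    else:
        raise ValueError(f"unknown op {op!r}")
    regs[dst] = result & 0xFF    # registers are one byte, so the result wraps

regs = [200, 100, 5, 10, 0x0F, 0xFF]
alu("add", regs, 0, 1)  # 200 + 100 = 300, wraps to 44
alu("sub", regs, 2, 3)  # 5 - 10 = -5, wraps to 251
alu("xor", regs, 4, 5)  # 0x0F ^ 0xFF = 0xF0
print(regs[0], regs[2], hex(regs[4]))  # 44 251 0xf0
```

Whether the assignment wants any carry or borrow tracked beyond this truncation is not something the thread says, so the sketch leaves it out.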

Load_R and Store_R
Both of these have the same problems mentioned before for Load and Store, about trying to do things with ebx instead of the address/offset within the instruction. For example, Load_R has:

movzx edx, program_buffer[ebx+edi]


and Store_R has

mov word ptr program_buffer[ebx+edi], dx


They both use ebx, which means you're doing things relative to the current instruction pointer, not the address provided with the instruction. You want to do things relative to the offset which the instruction pointer is pointing to. So you need to load the word (say) into edx that is pointed to by ebx, swap (etc) and then use that instead of ebx.

Jnz
This is the interesting one. Think about what this means conceptually: you are stepping through a program with ebx as your current instruction pointer. The Jnz instruction will, if the specified register is not 0, *change* ebx to be the offset pointed to in the instruction. In other words, this instruction allows you to change where you read your next instruction from.

Let's look at how you could do that:

add ebx, 01h ; increment to register
movzx eax, program_buffer[ebx] ; load the byte register index
add ebx, 01h ; increment past register to address
movzx edx, word ptr program_buffer[ebx] ; load the address that will be jumped to if register is not 0
xchg dh, dl ; get it in usable form
add ebx, 2 ; skip past the address
;; now you're set to do something. You have the register and the address to jump to.
cmp byte ptr RA[eax], 0 ; is the register 0?
je Start ; go to next instruction
;; And here's the magic
mov ebx, edx ; set the instruction pointer to the address that was in the instruction
jmp Start ; go to next instruction
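
One detail in the above that is easy to lose is the "word ptr" on the address load. Here is a small side-by-side sketch of what happens with and without it. The bytes 12h 34h are made up for illustration (standing for the address 1234h), and it assumes program_buffer is declared as byte:

```asm
; Fragment only. Assumes the two bytes at program_buffer[ebx] are 12h, 34h
; (hypothetical values; high byte stored first).

; With word ptr: both operand bytes are read.
movzx edx, word ptr program_buffer[ebx] ; edx = 00003412h (x86 treats the first byte as the low byte)
xchg dh, dl                             ; edx = 00001234h, the intended address

; Without word ptr: MASM uses the declared byte size and reads one byte.
movzx edx, program_buffer[ebx]          ; edx = 00000012h
xchg dh, dl                             ; edx = 00001200h, the low byte of the address is lost
```

So if a jump seems to land somewhere strange, the size of that load is worth a look in the debugger.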


And the rest seems ok, I think, except you need to get rid of the ecx stuff. If it's an unknown instruction (and it shouldn't be - maybe you just want to jump to an error if it is, as you're off in the weeds somewhere?), then you can just increment the instruction pointer and jmp back to Start.

HegemonKhan
hmm... now my program seems to run forever, displaying:

(spade) (spade) (spade) ... etc etc etc

I think I followed it along successfully in the debugger... I think it gets to the 'jnz' and then resets the index back to zero, and that's as far as I've tracked it (not going to step through it again, over and over, forever, as my program seems to go on forever, laughs)

-------

"why didn't you use the, for example of eax, ax/ah/al sub-reg-segments, when you could have"

possibly, since it's 32-bit, it's most efficient to use the 32-bit registers over their subdivisions (for example, for the eax reg: ax, ah, al), but I'm probably totally mistaken on this, as I probably didn't understand the professor when he lectured on whatever it was that put this notion into my head, lol. I've been wrong on a lot of stuff as I try to put together a correct understanding of all of these various aspects of assembly, laughs. I think there are some things that we can't do too, like indirect addressing with [ax/ah/al], in the 80386+ (.386+) masm build(s)

As I learn assembly, I'll know better when to use larger sizes (eax/ax) vs the smaller sizes (ax/ah/al)

Efficiency is the least of my concerns at the moment... I'm glad if I can get it working correctly... later on I'll know more of all the various operations/instruction sets available, understand better about efficiency, and etc, but for now, I don't care too much about it (as long as the inefficiency of my program isn't so bad that my program runs way too slow of course, lol)

--------

I guess I can post my new program (as maybe I didn't quite understand/follow some of your directions for correctly fixing up the bad operation algorithms I had), and maybe you can spot any mistakes still in my operations. I tried changing the jmps to 'near ptr' jumps, just in case the offset/index distance exceeded that for 'jmp'. I'm using the 'case sensitive' option, so I'm not sure if I need to use caps or lower case for the commands (or if they overloaded them, so both upper and lower case work). However, in the debugging, it was jumping around fine anyways.

the 'FF' (terminate) in the machine.bit file is offset: 9E h

I'm guessing this correlates to: ebx = 158 d/t ??? I'm not sure what type size ebx is increasing by... I'm adding 01h to it... but not sure if it's increasing by a byte (which would correlate, I think, to the 158 d/t = 9E h) or a dword (ebx:32 bits = dword)

I could try to write out (correctly with no mistakes) the machine.bin data, if you even have a means of using it (or want to, as this would mean running my program in an IDE, like VS, and trying to troubleshoot it for me --- I'd really hate for you to do this though, so ignore it!). The due date is less than 24 hrs away, so I don't have time to see if the prof can help debug it for me, and regardless I'm so thankful/grateful that I've been able to come this far, all thanks to you, Jay.

probably the best thing is to just check over my operations, seeing if I got them right from what you can generally tell (if there's any glaring bad logic).

I'm still worried that I probably need to use the add/sub with the carry/borrowing (adc/sbb), as I'd think the add/sub arithmetic would involve additional digits, which requires the carrying/borrowing to get the right calculation, right?

I don't really know where to even begin with the debugging in relation to the program algorithm... are there any particular values/addresses that I can follow to tell what is happening correctly and what is not? (If only the prof could have provided a mapping of some or all of the steps/manipulations, so you could use that to figure out where the errors are in your program's algorithms/logic/bit manipulations, etc, sighs)

-----

I think I maybe should put back in the check on exceeding the program_buffer segment... so my program doesn't run forever... (as I doubt that is what the prof intended, along with nicely formatted columns of spade characters being displayed, lol).

how would I do the 'if/compare' syntax in assembly? I haven't learned how to do the 'if-like conditionals', I only know how to use 'cmp', which was why I had used the 'cmp ecx, 00 h' --- I thought I had it set up correctly, every time the program uses the 'start label/loop section' it decrements the counter, but this logic doesn't work correctly?

------

here's my new code:

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

;redacted

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; This program's purpose is to emulate a CPU

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************

ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD

CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD

TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************

NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************

STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************

ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************

FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

MAX_RAM EQU 1024
INVALID_HANDLE_VALUE EQU -1
ERROR_ENUM EQU 1
READ_FILE_ERROR EQU 0
NULL_PTR EQU 0

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

ADD_ENUM EQU 11h
SUB_ENUM EQU 22h
XOR_ENUM EQU 44h
LOAD EQU 05h
LOAD_R EQU 55h
STORE EQU 06h
STORE_R EQU 66h
OUT_ENUM EQU 0CCh
JNZ_ENUM EQU 0AAh
HALT EQU 0FFh

;***********
; VARIABLES
;***********

heading byte "redacted", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to emulate a CPU", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

error_file_open byte "ERROR: Unable to open input file", \
CARRIAGE_RETURN, NEW_LINE_FEED

file_name byte "c:\redacted\machine.bin", 0

program_buffer byte MAX_RAM dup (0)

return_code dword 0

bytes_written dword 0
bytes_read dword 0
file_handle dword 0
file_size dword 0
handle_std_out dword 0
handle_std_in dword 0

RA byte 6 dup (0) ; RA = register array (R0-R5)

output_value dword ?

coding_error byte "You have an error in your code at index: "

coding_error_index dword ?

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

;*******************************
; Get handle to standard output
;*******************************

invoke GetStdHandle, STD_OUTPUT_HANDLE
mov handle_std_out, eax

;******************************
; Get handle to standard input
;******************************

invoke GetStdHandle, STD_INPUT_HANDLE
mov handle_std_in, eax

;********************************
; Open existing file for reading
;********************************

invoke CreateFileA, offset file_name, GENERIC_READ, FILE_SHARE_NONE, \
NULL_PTR, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL_PTR

cmp eax, INVALID_HANDLE_VALUE
je Open_Error
mov file_handle, eax

;*******************************************
; Determine the size of the file (in bytes)
;*******************************************

invoke GetFileSize, file_handle, NULL_PTR
mov file_size, eax

;****************************************
; Read the entire file into emulator RAM
;****************************************

invoke ReadFile, file_handle, offset program_buffer, file_size, \
offset bytes_read, NULL_PTR

cmp eax, READ_FILE_ERROR
je Finish

;****************
; Close the file
;****************

invoke CloseHandle, file_handle

Pre_Start:

invoke WriteConsoleA, handle_std_out, OFFSET heading, SIZEOF heading, \
OFFSET bytes_written, NULL_PTR

invoke WriteConsoleA, handle_std_out, OFFSET history, SIZEOF history, \
OFFSET bytes_written, NULL_PTR

invoke WriteConsoleA, handle_std_out, OFFSET purpose, SIZEOF purpose, \
OFFSET bytes_written, NULL_PTR

xor ebx, ebx

Start:

cmp program_buffer[ebx], LOAD
je Load

cmp program_buffer[ebx], STORE
je Store

cmp program_buffer[ebx], ADD_ENUM
je Add_Enum

cmp program_buffer[ebx], SUB_ENUM
je Sub_Enum

cmp program_buffer[ebx], XOR_ENUM
je Xor_Enum

cmp program_buffer[ebx], LOAD_R
je Load_R

cmp program_buffer[ebx], STORE_R
je Store_R

cmp program_buffer[ebx], JNZ_ENUM
je Jnz_Enum

cmp program_buffer[ebx], OUT_ENUM
je Out_Enum

cmp program_buffer[ebx], HALT
je Finish

jmp near ptr Catch_Coding_Error

Load:

add ebx, 01h
movzx edi, program_buffer[ebx]

add ebx, 01h
movzx edx, word ptr program_buffer[ebx]
xchg dh, dl
mov al, program_buffer[edx]

mov RA[edi], al

add ebx, 02h
jmp near ptr Start

Store:

add ebx, 01h
movzx edx, word ptr program_buffer[ebx]
xchg dh, dl

movzx ax, RA[0]

mov program_buffer[edx], al

add ebx, 02h
jmp near ptr Start

Add_Enum:

add ebx, 01h
movzx edi, program_buffer[ebx]
movzx eax, RA[edi]

add ebx, 01h
movzx esi, program_buffer[ebx]
movzx edx, RA[esi]

add eax, edx

mov RA[edi], al

add ebx, 01h
jmp near ptr Start

Sub_Enum:

add ebx, 01h
movzx edi, program_buffer[ebx]
movzx eax, RA[edi]

add ebx, 01h
movzx esi, program_buffer[ebx]
movzx edx, RA[esi]

sub eax, edx

mov RA[edi], al

add ebx, 01h
jmp near ptr Start

Xor_Enum:

add ebx, 01h
movzx eax, program_buffer[ebx]
mov edi, eax

add ebx, 01h
movzx edx, program_buffer[ebx]

xor eax, edx

mov RA[edi], al

add ebx, 01h
jmp near ptr Start

Load_R:

add ebx, 01h
movzx esi, program_buffer[ebx]
movzx edi, RA[esi]

add ebx, 01h
movzx edx, word ptr program_buffer[ebx]
xchg dh, dl
movzx eax, program_buffer[edx+edi]

mov RA[esi], al

add ebx, 02h
jmp near ptr Start

Store_R:

add ebx, 01h
movzx esi, program_buffer[ebx]
movzx edi, RA[esi]

movzx eax, RA[0]

add ebx, 01h
movzx edx, word ptr program_buffer[ebx]
xchg dh, dl

mov word ptr program_buffer[edx+edi], ax

add ebx, 02h
jmp near ptr Start

Jnz_Enum:

add ebx, 01h
movzx eax, program_buffer[ebx]
movzx esi, RA[eax]

add ebx, 01h
movzx edx, program_buffer[ebx]
xchg dh, dl

add ebx, 02h

cmp esi, dword ptr 00h
je Start
mov ebx, edx
jmp near ptr Start

Out_Enum:

add ebx, 01h
movzx eax, program_buffer[ebx]
movzx edx, RA[eax]

mov output_value, edx

invoke WriteConsoleA, handle_std_out, OFFSET output_value, \
SIZEOF output_value, OFFSET bytes_written, NULL_PTR

add ebx, 01h
jmp near ptr Start

Catch_Coding_Error:

invoke WriteConsoleA, handle_std_out, OFFSET coding_error, \
SIZEOF coding_error, OFFSET bytes_written, NULL_PTR

mov coding_error_index, ebx

invoke WriteConsoleA, handle_std_out, OFFSET coding_error_index, \
SIZEOF coding_error_index, OFFSET bytes_written, NULL_PTR

jmp near ptr Finish

Open_Error:

invoke WriteConsoleA, handle_std_out, offset error_file_open, \
sizeof error_file_open, offset bytes_written, NULL_PTR

mov return_code, ERROR_ENUM

Finish:

invoke ExitProcess, return_code

Main endp

END Main


I've no idea if my 'catch_coding_error' works or not, as I guess it always was able to do an operation... (unless my catch_coding_error doesn't work)

jaynabonne
I'm looking through it now. The only thing I see so far is that I was wrong: the XOR handler is not right. Your add and sub ones look to be ok, though, so if you make the XOR one be like those, that will fix that. (You're xor'ing the register indices, not the register contents themselves!)

Still looking...

jaynabonne
Another thought: do your cmp's need to be like

cmp byte ptr program_buffer[ebx], LOAD

?
I don't know what the default size for a cmp is. What you could do to help debug it is to do an OUT sort of thing for each instruction. For example, output "L" when you hit a load, etc. Then you could see what's happening better.
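
For what it's worth, as far as I know MASM takes the operand size from how the variable is declared, so with program_buffer declared as byte the compare should already be a byte compare, and "byte ptr" just makes that explicit. For the trace idea, here is a minimal untested sketch for the Load handler; trace_load is a made-up name, and the rest of the handler stays as it is:

```asm
; In .DATA (trace_load is a hypothetical name for the marker character):
trace_load byte "L"

; In .CODE, at the top of the Load handler, before any registers are loaded:
Load:
invoke WriteConsoleA, handle_std_out, OFFSET trace_load, \
SIZEOF trace_load, OFFSET bytes_written, NULL_PTR
; The API call may change eax, ecx and edx (and the flags);
; ebx, esi and edi are preserved, so the instruction pointer is safe.
add ebx, 01h
; ... rest of the Load handler unchanged ...
```

Doing the same with a different letter in each handler would print the sequence of instructions as they run.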

HegemonKhan
with the xor, do I:

reg1+reg2 = "reg3" and then use "reg3" as the index for getting the value at it (which is then stored into the register) ??

I'm a bit unclear of understanding your post (and unclear of the needed xor logic) on my xor issue...

--------

when I do the stepping in the debugger... it seems to be correctly jumping to the instructions:

start, load, start, xor, start, load, start, add, start, xor, start, store, etc...

(whether it's doing the right manipulation or instruction or using the right values/addresses, is of course another matter entirely, which would be daunting in trying to debug... as far as I can tell)

------

the debugger's stepping does a pretty good job of what operation it's doing... the problem is with such as when it gets an address or value, is that the right address value, is it doing the correct operation's details/specifics or not...

maybe the prof will provide a map of what it's supposed to be doing (operation order and the values/addresses at each of the steps), or I can take his working solution code and try to use it to figure out what was going wrong with my code. So, afterwards, I can hopefully learn what was/is going wrong with my program.

jaynabonne
You do the same as in the add and sub, just use the xor instead of add and sub. :) So you could grab the code for sub, copy and paste it to xor and then change the "sub" to "xor". You want to xor the values (not the indices - so RA[index]) of reg1 and reg2 and store that result in reg1.
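
Something like this, in other words. It's just your Sub handler with the one instruction changed, so treat it as an untested sketch:

```asm
Xor_Enum:
add ebx, 01h
movzx edi, byte ptr program_buffer[ebx] ; edi = index of reg1
movzx eax, byte ptr RA[edi]             ; eax = contents of reg1

add ebx, 01h
movzx esi, byte ptr program_buffer[ebx] ; esi = index of reg2
movzx edx, byte ptr RA[esi]             ; edx = contents of reg2

xor eax, edx                            ; xor the contents, not the indices

mov RA[edi], al                         ; result goes back into reg1

add ebx, 01h
jmp Start
```

Note that an instruction like "XOR reg, same reg" then clears that register, which is probably what the program is using it for.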

jaynabonne
Another problem: in Store_R, you only want to write a byte value. So make it:

mov byte ptr program_buffer[edx+edi], al

HegemonKhan
I added some stuff (responded to some of your questions) to my previous post.

-----

ah, I see with the xor, I'm tired... laughs, sorry about not seeing why the xor was wrong.

jaynabonne

the debugger's stepping does a pretty good job of what operation it's doing... the problem is with such as when it gets an address or value, is that the right address value, is it doing the correct operation's details/specifics or not...


I had thought about a debugger. I'm glad you have one. I think the main thing is whether you'd be able to do yourself (on paper) what the code is supposed to be doing: step byte by byte through the bin file and "execute" the code. It might be a bit late for that, but it would help you to be sure you understand what's supposed to be happening, by trying it out by hand yourself.

Did the above fixes have any effect?

HegemonKhan
haven't tested it yet, give me a min...

ya, except I'm not certain I could/would be able to map it out correctly myself... if I can't get the right mapping of it by hand, then I can't use it to check my program, lol. I'm also still a bit shaky with the number systems, as I've not used hexadecimal/binary at all until this assembly class, so I don't have much practice with them (too much other homework, not much time to study in more detail for my classes, sighs).

(and even if I could, it'd probably take me a long time to go through it... the machine.bin has an offset of 9E, or 10 rows * 16 columns, for the first FF:terminate, which I'm not particularly excited about doing... that's a lot of manipulation/calculation work for me, if I could even do it correctly, lol)

if I had more time, I'd definitely try doing it... not sure if I could do it correctly of course though... (this was something that I had thought about trying to do, but never had the time to get around to it, as I was busy trying to just get to where I am now, thanks again to your help)

HegemonKhan
hmm... I get different characters/symbols now... except it's an infinite loop... and it crashed/froze my computer when I tried to quit it, lol

some possible "progress", I think... if you could call that progress, laughs.

it might be almost "working" now (not that I have any idea of how it's supposed to work/look, lol)... except something is wrong, causing it to skip the FF (terminate) or to keep jumping back to index 0. The problem is finding what error(s) are causing it to skip the FF or to keep jumping back to index 0.... hmm...

jaynabonne
Ok, I see one more problem. When you're doing an "OUT", it's doing a console write, which takes characters. I'm not sure if in the assignment it says to output as characters or as numbers. (For example, if you have the byte value 0x41, do you output it as text "65" (decimal) or "41" (hex) or as the ASCII character 'A'?)

But right now, you're taking a byte value and shoving it into a DWORD output_value and then printing all four bytes of that. It's possible that the console write will hit the null (0) in that buffer and not print those, but it doesn't seem quite right.

If you're supposed to be printing the byte as an ASCII character, then I'd make output_value a byte and just store the byte there for printing.
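
A minimal sketch of that ASCII-character version, assuming that really is what the assignment wants (untested):

```asm
; In .DATA, replacing the dword declaration:
output_value byte ?

; In .CODE:
Out_Enum:
add ebx, 01h
movzx eax, byte ptr program_buffer[ebx] ; eax = register index
mov dl, RA[eax]                         ; dl = register contents
mov output_value, dl                    ; store just the one byte

; SIZEOF output_value is now 1, so exactly one character is written
invoke WriteConsoleA, handle_std_out, OFFSET output_value, \
SIZEOF output_value, OFFSET bytes_written, NULL_PTR

add ebx, 01h
jmp Start
```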

If you're supposed to be outputting it as a number, then you're going to need to convert the byte value from the register into characters to print, in some format. (Hex output is easier, I think, as it's always two characters to output - 0x45 would just be "45" - whereas a byte formatted as decimal takes anywhere from one to three digits, so it's just a bit more fiddly.)
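
If it does turn out to be hex, one way to do the conversion is a 16-entry lookup table, one lookup per nibble. This is only a sketch: hex_digits and hex_out are made-up names, and it assumes esi already holds the register index:

```asm
; In .DATA (both names are hypothetical):
hex_digits byte "0123456789ABCDEF"
hex_out    byte 2 dup (?)

; In .CODE, assuming esi = index of the register to print:
movzx eax, byte ptr RA[esi] ; eax = byte value, e.g. 45h
mov edx, eax
shr edx, 4                  ; edx = high nibble (4)
mov dl, hex_digits[edx]     ; dl = '4'
mov hex_out[0], dl
and eax, 0Fh                ; eax = low nibble (5)
mov al, hex_digits[eax]     ; al = '5'
mov hex_out[1], al

invoke WriteConsoleA, handle_std_out, OFFSET hex_out, \
SIZEOF hex_out, OFFSET bytes_written, NULL_PTR
```

Decimal would need repeated division by 10 instead, which is why it's the more fiddly of the two.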

jaynabonne
Also, if you were able to send me the .bin file (e.g. email) I might be able to figure out what it's supposed to be doing... :)

HegemonKhan
We don't have any instruction on what is supposed to be output... except the given 'out' instruction details:

OUT, CCh, reg1, --, send value in reg1 to out (screen)

"your program should use the windows API WriteConsole to output to the screen"

--------

You'd think that the program would output some phrase/word (such as "hello world" or "isn't assembly fun ... sarcasm") in terms of just using simple "common sense", but that's just a guess, of course. Maybe it outputs some sequence of numbers instead, or it could be anything.

-------

well, regardless of what it's supposed to output, the problem is that the program never terminates correctly.

----------

I don't want you to try to debug/troubleshoot it; you've done so much already, too much. You helped me nearly fully understand the program and assembly concepts, compared to where I started. I just need to learn how to debug well on my own.

Maybe after I complete my semester, I can do the debugging myself and see what I was doing wrong, and if I still can't then, I'll think about whether to get your help on the debugging.

At the very least, I want to try/make an attempt, at doing the debugging myself, seeing if I can do it by hand, and I'll have the solution code and/or mapping from the prof to help me as well with the debugging.

-------

it's already 1 am here, and I have to submit my program by 5 pm. I don't think I have the time to try to go through it by hand, as I should at least put in all the commenting I've not yet done, as the commenting is important especially with assembly, and the prof obviously tries to emphasize good commenting/documentation. I'm fortunate enough to have gotten this far, thanks to you. So, I think this has been more than enough help from you on learning this program.

-----

Let me see if I can learn what I was doing wrong on my own from here.

Thank you so very much, Jay!

jaynabonne
I wasn't so much going to debug (I wasn't planning on using your code, for example) as much as 1) make sure *I* understand what's supposed to be happening, that the way the opcodes are defined in the assignment matches the bin file, and 2) give you a clue about what's supposed to be output.

As you say, it seems it would output a message, and I was curious what that was. :)

But I can also understand if you don't want to go there.

HegemonKhan
Even if it wouldn't be too much work for you to do that, I've already gotten more than enough help from you. I'm quite glad to have gotten as far as I have, thanks to you (compared to being totally clueless and wrong at the start, lol).

If you're that curious for your own interests, I can email the bin file sometime after my class tomorrow.

I don't care about getting the assignment done/points, I just want to learn assembly, and you've greatly helped me with that. It's more important that I actually learn to program than stupid school points/grades. I'm paying the school to get educated, so I can actually get a job; I'm not paying to get meaningless school points/grades. Jobs don't care about meaningless school points/grades, they care if you can program, if you can fix or solve their code, or prevent or find and stop malicious attacks in terms of networking and etc. I'm already old, so the most important thing is to actually learn the programming. I'm not going to any big prestigious university, I'm just trying to get towards a degree and job as quickly as I can, before it's too late and I'm stuck doing minimum-wage jobs, flipping hamburgers and whatever, the rest of my life...

jaynabonne
I get where you're coming from, and I think you have the right idea - when I was interviewing people for a company I used to work for, even having a PhD didn't matter as much as whether they could actually program.

If you end up not getting satisfaction and understanding from your instructor when this is all said and done, let me know, and we'll make sure you understand what happened or should have happened - and why - by going through things.

HegemonKhan
well... I decided to start on trying to learn this debugging on my own... this is what I've done so far...

(I need to create a program to do this stuff for me, laughs - it took me like at least 3 hrs to do this stuff, wasn't fun!, :x)
(jokingly, I think I got carpal tunnel syndrome from doing this stuff, just joking, thankfully)

-------------

The "machine.bin" file (hopefully without typos/mistakes):

0000 0000 05 04 01 A3 44 02 02 05 00 01 A0 05 04 01 A3 11 .... D... .... ....
0000 0010 00 04 06 01 A0 05 03 01 A1 44 01 01 11 01 00 55 .... .... .D.. ...U
0000 0020 01 00 A0 11 03 01 44 05 05 11 05 00 44 00 00 11 .... ..D. .... D...
0000 0030 00 03 06 01 1A 44 00 00 11 00 05 44 01 01 11 01 .... .D.. ...D ....
0000 0040 03 55 01 00 A0 44 05 05 11 05 01 44 01 01 11 01 .U.. .D.. ...D ....
0000 0050 00 55 01 00 A0 44 04 04 11 04 00 44 00 00 11 00 .U.. .D.. ...D ....
0000 0060 01 66 03 00 A0 44 00 00 11 00 05 66 04 00 A0 11 .f.. .D.. ...f ....
0000 0070 01 05 55 01 00 A0 44 04 04 11 04 01 44 01 01 11 ..U. ..D. .... D...
0000 0080 01 02 55 01 01 A4 44 01 04 CC 01 05 04 01 A3 05 ..U. ..D. .... ....
0000 0090 05 01 A2 11 02 04 22 05 02 AA 05 00 07 FF 00 00 .... ..". .... ....
0000 00A0 63 06 37 A6 16 84 CC 71 E5 5A CD 0B 0C 0D 0E 0F c.7. ...q .Z.. ....
0000 00B0 10 11 12 13 14 15 04 17 18 19 1A 1B 1C 1D 1E 1F .... .... .... ....
0000 00C0 20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D 2E 2F !"# $%&' ()*+ ,-./
0000 00D0 30 31 32 33 34 35 36 02 38 39 3A 3B 3C 3D 3E 3F 0123 456. 89:; <=>?
0000 00E0 40 41 42 43 44 45 46 47 48 49 4A 4B 4C 4D 4E 4F @ABC DEFG HIJK LMNO
0000 00F0 50 51 52 53 54 55 56 57 58 59 09 5B 5C 5D 5E 5F PQRS TUVW XY.[ \]^_
0000 0100 60 61 62 63 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F `ab. defg hijk lmno
0000 0110 70 07 72 73 74 75 76 77 78 79 7A 7B 7C 7D 7E 7F p.rs tuvw xyz{ |}~.
0000 0120 80 81 82 83 84 85 86 87 88 89 8A 8B 8C 8D 8E 8F .... .... .... ....
0000 0130 90 91 92 93 94 95 96 97 98 99 9A 9B 9C 9D 9E 9F .... .... .... ....
0000 0140 A0 A1 A2 A3 A4 A5 03 A7 A8 A9 AA AB AC AD AE AF .... .... .... ....
0000 0150 B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 BA BB BC BD BE BF .... .... .... ....
0000 0160 C0 C1 C2 C3 C4 C5 C6 C7 C8 C9 CA CB 01 0A CE CF .... .... .... ....
0000 0170 D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 DA DB DC DD DE DF .... .... .... ....
0000 0180 E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF .... .... .... ....
0000 0190 F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF .... .... .... ....
0000 01A0 00 00 AE 01 8B 1B FC 2F A9 E0 11 D6 E8 42 24 47 .... .../ .... .B$G
0000 01B0 58 33 31 12 00 E3 F7 CD A9 DD 65 D0 17 4D 6A 7E X31. .... ..e. .Mj~
0000 01C0 1F CD DF B9 DE F6 03 4E 1C FD CF 89 25 E9 47 FD .... ...N .... %.G.
0000 01D0 B6 35 01 8A 63 22 87 5B 52 0F 45 23 8B C3 A9 30 .5.. c".[ R.E# ...0
0000 01E0 0D B4 D0 FE C1 F2 00 F0 71 68 ED E6 15 04 DD 19 .... .... qh.. ....
0000 01F0 2D 0A 9D 57 22 87 99 4C 80 18 D7 27 2D 73 27 80 -..W "..L ...' -s'.
0000 0200 2F C0 29 9E 8F 3D 31 1D 74 84 64 88 33 1D 72 20 /.). .=1. t.d. 3.r
0000 0210 BB FC D6 2E 26 A4 16 5C F8 54 6D C5 E2 4E A3 41 .... &..\ .Tm. .N.A
0000 0220 EE 12 38 1A F9 82 6E DC C5 7A 79 02 C5 D9 74 3F ..8. ..n. .zy. ..t?
0000 0230 A4 3E 66 36 4C BF B4 BD 9A 82 36 40 90 50 95 F4 .>f6 L... ..6@ .P..
0000 0240 92 BD 77 B9 17 A2 BE 8E 1B 7A 98 2C 1E 8E 16 6E ..w. .... .z., ...n
0000 0250 AB CA ..


and the first step I'm doing (still working on it), is mapping out the machine.bin operations (I'm now ready to try to fill in the details of the operations for what I'll do next - taking a short-medium break at the moment, lol):

(it's the machine.bin formatted vertically, so I can write in the details of each step/operation... and once that is done, hopefully correctly, I can then use it to compare with my program via debugging)

(again, hopefully without typos/mistakes, lol)

0000 0000 05 . LOAD
0000 0001 04 . REG4
0000 0002 01 . 01
0000 0003 A3 . A3

0000 0004 44 D XOR
0000 0005 02 . REG2
0000 0006 02 . REG2
0000 0007 05 . LOAD

0000 0008 00 . REG0
0000 0009 01 . 01
0000 000A A0 . A0
0000 000B 05 . LOAD

0000 000C 04 . REG4
0000 000D 01 . 01
0000 000E A3 . A3
0000 000F 11 . ADD

--------------------

0000 0010 00 . REG0
0000 0011 04 . REG4
0000 0012 06 . STORE
0000 0013 01 . 01

0000 0014 A0 . A0
0000 0015 05 . LOAD
0000 0016 03 . REG3
0000 0017 01 . 01

0000 0018 A1 . A1
0000 0019 44 D XOR
0000 001A 01 . REG1
0000 001B 01 . REG1

0000 001C 11 . ADD
0000 001D 01 . REG1
0000 001E 00 . REG0
0000 001F 55 U LOADR

---------------------

0000 0020 01 . REG1
0000 0021 00 . 00
0000 0022 A0 . A0
0000 0023 11 . ADD

0000 0024 03 . REG3
0000 0025 01 . REG1
0000 0026 44 D XOR
0000 0027 05 . REG5

0000 0028 05 . REG5
0000 0029 11 . ADD
0000 002A 05 . REG5
0000 002B 00 . REG0

0000 002C 44 D XOR
0000 002D 00 . REG0
0000 002E 00 . REG0
0000 002F 11 . ADD

-----------------------

0000 0030 00 . REG0
0000 0031 03 . REG3
0000 0032 06 . STORE
0000 0033 01 . 01

0000 0034 1A . 1A
0000 0035 44 D XOR
0000 0036 00 . REG0
0000 0037 00 . REG0

0000 0038 11 . ADD
0000 0039 00 . REG0
0000 003A 05 . REG5
0000 003B 44 D XOR

0000 003C 01 . REG1
0000 003D 01 . REG1
0000 003E 11 . ADD
0000 003F 01 . REG1

----------------------

0000 0040 03 . REG3
0000 0041 55 U LOADR
0000 0042 01 . REG1
0000 0043 00 . 00

0000 0044 A0 . A0
0000 0045 44 D XOR
0000 0046 05 . REG5
0000 0047 05 . REG5

0000 0048 11 . ADD
0000 0049 05 . REG5
0000 004A 01 . REG1
0000 004B 44 D XOR

0000 004C 01 . REG1
0000 004D 01 . REG1
0000 004E 11 . ADD
0000 004F 01 . REG1

-----------------------

0000 0050 00 . REG0
0000 0051 55 U LOADR
0000 0052 01 . REG1
0000 0053 00 . 00

0000 0054 A0 . A0
0000 0055 44 D XOR
0000 0056 04 . REG4
0000 0057 04 . REG4

0000 0058 11 . ADD
0000 0059 04 . REG4
0000 005A 00 . REG0
0000 005B 44 D XOR

0000 005C 00 . REG0
0000 005D 00 . REG0
0000 005E 11 . ADD
0000 005F 00 . REG0

----------------------

0000 0060 01 . REG1
0000 0061 66 f STORER
0000 0062 03 . REG3
0000 0063 00 . 00

0000 0064 A0 . A0
0000 0065 44 D XOR
0000 0066 00 . REG0
0000 0067 00 . REG0

0000 0068 11 . ADD
0000 0069 00 . REG0
0000 006A 05 . REG5
0000 006B 66 f STORER

0000 006C 04 . REG4
0000 006D 00 . 00
0000 006E A0 . A0
0000 006F 11 . ADD

----------------------

0000 0070 01 . REG1
0000 0071 05 . REG5
0000 0072 55 U LOADR
0000 0073 01 . REG1

0000 0074 00 . 00
0000 0075 A0 . A0
0000 0076 44 D XOR
0000 0077 04 . REG4

0000 0078 04 . REG4
0000 0079 11 . ADD
0000 007A 04 . REG4
0000 007B 01 . REG1

0000 007C 44 D XOR
0000 007D 01 . REG1
0000 007E 01 . REG1
0000 007F 11 . ADD

-----------------------

0000 0080 01 . REG1
0000 0081 02 . REG2
0000 0082 55 U LOADR
0000 0083 01 . REG1

0000 0084 01 . 01
0000 0085 A4 . A4
0000 0086 44 D XOR
0000 0087 01 . REG1

0000 0088 04 . REG4
0000 0089 CC . OUT
0000 008A 01 . REG1
0000 008B 05 . LOAD

0000 008C 04 . REG4
0000 008D 01 . 01
0000 008E A3 . A3
0000 008F 05 . LOAD

---------------------

0000 0090 05 . REG5
0000 0091 01 . 01
0000 0092 A2 . A2
0000 0093 11 . ADD

0000 0094 02 . REG2
0000 0095 04 . REG4
0000 0096 22 " SUB
0000 0097 05 . REG5

0000 0098 02 . REG2
0000 0099 AA . JNZ
0000 009A 05 . REG5
0000 009B 00 . 00

0000 009C 07 . 07
0000 009D FF . HALT
0000 009E 00 .
0000 009F 00 .

------------------------

0000 00A0 63 c
0000 00A1 06 .
0000 00A2 37 7
0000 00A3 A6 .

0000 00A4 16 .
0000 00A5 84 .
0000 00A6 CC .
0000 00A7 71 q

0000 00A8 E5 .
0000 00A9 5A Z
0000 00AA CD .
0000 00AB 0B .

0000 00AC 0C .
0000 00AD 0D .
0000 00AE 0E .
0000 00AF 0F .

------------------------

0000 00B0 10 .
0000 00B1 11 .
0000 00B2 12 .
0000 00B3 13 .

0000 00B4 14 .
0000 00B5 15 .
0000 00B6 04 .
0000 00B7 17 .

0000 00B8 18 .
0000 00B9 19 .
0000 00BA 1A .
0000 00BB 1B .

0000 00BC 1C .
0000 00BD 1D .
0000 00BE 1E .
0000 00BF 1F .

----------------------------

0000 00C0 20
0000 00C1 21 !
0000 00C2 22 "
0000 00C3 23 #

0000 00C4 24 $
0000 00C5 25 %
0000 00C6 26 &
0000 00C7 27 '

0000 00C8 28 (
0000 00C9 29 )
0000 00CA 2A *
0000 00CB 2B +

0000 00CC 2C ,
0000 00CD 2D -
0000 00CE 2E .
0000 00CF 2F /

-----------------

0000 00D0 30 0
0000 00D1 31 1
0000 00D2 32 2
0000 00D3 33 3

0000 00D4 34 4
0000 00D5 35 5
0000 00D6 36 6
0000 00D7 02 .

0000 00D8 38 8
0000 00D9 39 9
0000 00DA 3A :
0000 00DB 3B ;

0000 00DC 3C <
0000 00DD 3D =
0000 00DE 3E >
0000 00DF 3F ?

--------------------

0000 00E0 40 @
0000 00E1 41 A
0000 00E2 42 B
0000 00E3 43 C

0000 00E4 44 D
0000 00E5 45 E
0000 00E6 46 F
0000 00E7 47 G

0000 00E8 48 H
0000 00E9 49 I
0000 00EA 4A J
0000 00EB 4B K

0000 00EC 4C L
0000 00ED 4D M
0000 00EE 4E N
0000 00EF 4F O

---------------------

0000 00F0 50 P
0000 00F1 51 Q
0000 00F2 52 R
0000 00F3 53 S

0000 00F4 54 T
0000 00F5 55 U
0000 00F6 56 V
0000 00F7 57 W

0000 00F8 58 X
0000 00F9 59 Y
0000 00FA 09 .
0000 00FB 5B [

0000 00FC 5C \
0000 00FD 5D ]
0000 00FE 5E ^
0000 00FF 5F _

-----------------------

0000 0100 60 `
0000 0101 61 a
0000 0102 62 b
0000 0103 63 .

0000 0104 64 d
0000 0105 65 e
0000 0106 66 f
0000 0107 67 g

0000 0108 68 h
0000 0109 69 i
0000 010A 6A j
0000 010B 6B k

0000 010C 6C l
0000 010D 6D m
0000 010E 6E n
0000 010F 6F o

----------------------

0000 0110 70 p
0000 0111 07 .
0000 0112 72 r
0000 0113 73 s

0000 0114 74 t
0000 0115 75 u
0000 0116 76 v
0000 0117 77 w

0000 0118 78 x
0000 0119 79 y
0000 011A 7A z
0000 011B 7B {

0000 011C 7C |
0000 011D 7D }
0000 011E 7E ~
0000 011F 7F .

-----------------------

0000 0120 80 .
0000 0121 81 .
0000 0122 82 .
0000 0123 83 .

0000 0124 84 .
0000 0125 85 .
0000 0126 86 .
0000 0127 87 .

0000 0128 88 .
0000 0129 89 .
0000 012A 8A .
0000 012B 8B .

0000 012C 8C .
0000 012D 8D .
0000 012E 8E .
0000 012F 8F .

------------------

0000 0130 90 .
0000 0131 91 .
0000 0132 92 .
0000 0133 93 .

0000 0134 94 .
0000 0135 95 .
0000 0136 96 .
0000 0137 97 .

0000 0138 98 .
0000 0139 99 .
0000 013A 9A .
0000 013B 9B .

0000 013C 9C .
0000 013D 9D .
0000 013E 9E .
0000 013F 9F .

----------------------

0000 0140 A0 .
0000 0141 A1 .
0000 0142 A2 .
0000 0143 A3 .

0000 0144 A4 .
0000 0145 A5 .
0000 0146 03 .
0000 0147 A7 .

0000 0148 A8 .
0000 0149 A9 .
0000 014A AA .
0000 014B AB .

0000 014C AC .
0000 014D AD .
0000 014E AE .
0000 014F AF .

-----------------------

0000 0150 B0 .
0000 0151 B1 .
0000 0152 B2 .
0000 0153 B3 .

0000 0154 B4 .
0000 0155 B5 .
0000 0156 B6 .
0000 0157 B7 .

0000 0158 B8 .
0000 0159 B9 .
0000 015A BA .
0000 015B BB .

0000 015C BC .
0000 015D BD .
0000 015E BE .
0000 015F BF .

------------------

0000 0160 C0 .
0000 0161 C1 .
0000 0162 C2 .
0000 0163 C3 .

0000 0164 C4 .
0000 0165 C5 .
0000 0166 C6 .
0000 0167 C7 .

0000 0168 C8 .
0000 0169 C9 .
0000 016A CA .
0000 016B CB .

0000 016C 01 .
0000 016D 0A .
0000 016E CE .
0000 016F CF .

----------------------

0000 0170 D0 .
0000 0171 D1 .
0000 0172 D2 .
0000 0173 D3 .

0000 0174 D4 .
0000 0175 D5 .
0000 0176 D6 .
0000 0177 D7 .

0000 0178 D8 .
0000 0179 D9 .
0000 017A DA .
0000 017B DB .

0000 017C DC .
0000 017D DD .
0000 017E DE .
0000 017F DF .

-----------------------

0000 0180 E0 .
0000 0181 E1 .
0000 0182 E2 .
0000 0183 E3 .

0000 0184 E4 .
0000 0185 E5 .
0000 0186 E6 .
0000 0187 E7 .

0000 0188 E8 .
0000 0189 E9 .
0000 018A EA .
0000 018B EB .

0000 018C EC .
0000 018D ED .
0000 018E EE .
0000 018F EF .

------------------

0000 0190 F0 .
0000 0191 F1 .
0000 0192 F2 .
0000 0193 F3 .

0000 0194 F4 .
0000 0195 F5 .
0000 0196 F6 .
0000 0197 F7 .

0000 0198 F8 .
0000 0199 F9 .
0000 019A FA .
0000 019B FB .

0000 019C FC .
0000 019D FD .
0000 019E FE .
0000 019F FF .

----------------------

0000 01A0 00 .
0000 01A1 00 .
0000 01A2 AE .
0000 01A3 01 .

0000 01A4 8B .
0000 01A5 1B .
0000 01A6 FC .
0000 01A7 2F /

0000 01A8 A9 .
0000 01A9 E0 .
0000 01AA 11 .
0000 01AB D6 .

0000 01AC E8 .
0000 01AD 42 B
0000 01AE 24 $
0000 01AF 47 G

-----------------------

0000 01B0 58 X
0000 01B1 33 3
0000 01B2 31 1
0000 01B3 12 .

0000 01B4 00 .
0000 01B5 E3 .
0000 01B6 F7 .
0000 01B7 CD .

0000 01B8 A9 .
0000 01B9 DD .
0000 01BA 65 e
0000 01BB D0 .

0000 01BC 17 .
0000 01BD 4D M
0000 01BE 6A j
0000 01BF 7E ~

------------------

0000 01C0 1F .
0000 01C1 CD .
0000 01C2 DF .
0000 01C3 B9 .

0000 01C4 DE .
0000 01C5 F6 .
0000 01C6 03 .
0000 01C7 4E N

0000 01C8 1C .
0000 01C9 FD .
0000 01CA CF .
0000 01CB 89 .

0000 01CC 25 %
0000 01CD E9 .
0000 01CE 47 G
0000 01CF FD .

----------------------

0000 01D0 B6 .
0000 01D1 35 5
0000 01D2 01 .
0000 01D3 8A .

0000 01D4 63 c
0000 01D5 22 "
0000 01D6 87 .
0000 01D7 5B [

0000 01D8 52 R
0000 01D9 0F .
0000 01DA 45 E
0000 01DB 23 #

0000 01DC 8B .
0000 01DD C3 .
0000 01DE A9 .
0000 01DF 30 0

-----------------------

0000 01E0 0D .
0000 01E1 B4 .
0000 01E2 D0 .
0000 01E3 FE .

0000 01E4 C1 .
0000 01E5 F2 .
0000 01E6 00 .
0000 01E7 F0 .

0000 01E8 71 q
0000 01E9 68 h
0000 01EA ED .
0000 01EB E6 .

0000 01EC 15 .
0000 01ED 04 .
0000 01EE DD .
0000 01EF 19 .

------------------

0000 01F0 2D -
0000 01F1 0A .
0000 01F2 9D .
0000 01F3 57 W

0000 01F4 22 "
0000 01F5 87 .
0000 01F6 99 .
0000 01F7 4C L

0000 01F8 80 .
0000 01F9 18 .
0000 01FA D7 .
0000 01FB 27 '

0000 01FC 2D -
0000 01FD 73 s
0000 01FE 27 '
0000 01FF 80 .

----------------------

0000 0200 2F /
0000 0201 C0 .
0000 0202 29 )
0000 0203 9E .

0000 0204 8F .
0000 0205 3D =
0000 0206 31 1
0000 0207 1D .

0000 0208 74 t
0000 0209 84 .
0000 020A 64 d
0000 020B 88 .

0000 020C 33 3
0000 020D 1D .
0000 020E 72 r
0000 020F 20

-----------------------

0000 0210 BB .
0000 0211 FC .
0000 0212 D6 .
0000 0213 2E .

0000 0214 26 &
0000 0215 A4 .
0000 0216 16 .
0000 0217 5C \

0000 0218 F8 .
0000 0219 54 T
0000 021A 6D m
0000 021B C5 .

0000 021C E2 .
0000 021D 4E N
0000 021E A3 .
0000 021F 41 A

------------------

0000 0220 EE .
0000 0221 12 .
0000 0222 38 8
0000 0223 1A .

0000 0224 F9 .
0000 0225 82 .
0000 0226 6E n
0000 0227 DC .

0000 0228 C5 .
0000 0229 7A z
0000 022A 79 y
0000 022B 02 .

0000 022C C5 .
0000 022D D9 .
0000 022E 74 t
0000 022F 3F ?

-----------------------

0000 0230 A4 .
0000 0231 3E >
0000 0232 66 f
0000 0233 36 6

0000 0234 4C L
0000 0235 BF .
0000 0236 B4 .
0000 0237 BD .

0000 0238 9A .
0000 0239 82 .
0000 023A 36 6
0000 023B 40 @

0000 023C 90 .
0000 023D 50 P
0000 023E 95 .
0000 023F F4 .

------------------

0000 0240 92 .
0000 0241 BD .
0000 0242 77 w
0000 0243 B9 .

0000 0244 17 .
0000 0245 A2 .
0000 0246 BE .
0000 0247 8E .

0000 0248 1B .
0000 0249 7A z
0000 024A 98 .
0000 024B 2C ,

0000 024C 1E .
0000 024D 8E .
0000 024E 16 .
0000 024F 6E n

----------------

0000 0250 AB .
0000 0251 CA .
0000 0252
0000 0253

0000 0254
0000 0255
0000 0256
0000 0257

0000 0258
0000 0259
0000 025A
0000 025B

0000 025C
0000 025D
0000 025E
0000 025F

jaynabonne
Looks like a good plan! And keep in mind that there won't just be code in the .bin file. There will be data as well. For example, the first instruction is reading from offset 0x01a3, so you can be fairly sure that that will be a data byte and not a code byte.

So basically you shouldn't have to decode the entire file as instructions, assuming the code doesn't jump all over the place. You might only have to go as far as the HALT, if all the jnz's jump backwards (if that makes sense).

Edit: in fact, from around offset 0xab onward, it definitely looks like data for a while. :) (increasing integers from 0x0b to 0xff).
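
A minimal sketch of that idea: walk the bytes one instruction at a time and stop at the HALT, instead of splitting the file into fixed groups of four. The operand layouts in `OPCODES` are assumptions taken from the JavaScript interpreter posted further down in this thread; the names `disassemble` and `hex4` are made up for the example.

```javascript
// Operand layout per opcode: "r" = one register byte,
// "a" = two-byte address (high byte first). Assumed from the interpreter
// later in the thread, not from any official spec for this machine.
const OPCODES = {
  0x05: ["LOAD", "ra"],
  0x06: ["STORE", "a"],
  0x11: ["ADD", "rr"],
  0x22: ["SUB", "rr"],
  0x44: ["XOR", "rr"],
  0x55: ["LOADR", "ra"],
  0x66: ["STORER", "ra"],
  0xaa: ["JNZ", "ra"],
  0xcc: ["OUT", "r"],
  0xff: ["HALT", ""],
};

// Format a number as four uppercase hex digits.
function hex4(n) {
  return n.toString(16).toUpperCase().padStart(4, "0");
}

// Returns one text line per instruction, stopping at HALT or at the first
// byte that is not a known opcode (most likely data).
function disassemble(bytes) {
  const lines = [];
  let ip = 0;
  while (ip < bytes.length) {
    const start = ip;
    const entry = OPCODES[bytes[ip++]];
    if (!entry) {
      lines.push(hex4(start) + "  ??");
      break;
    }
    const [name, layout] = entry;
    const operands = [];
    for (const kind of layout) {
      if (kind === "r") {
        operands.push("r" + bytes[ip++]);
      } else {
        operands.push("0x" + hex4(bytes[ip] * 256 + bytes[ip + 1]));
        ip += 2;
      }
    }
    lines.push((hex4(start) + "  " + name + " " + operands.join(", ")).trimEnd());
    if (name === "HALT") break;
  }
  return lines;
}

// The first 18 bytes of the dump above, with a HALT appended so the walk stops.
const head = [0x05, 0x04, 0x01, 0xA3, 0x44, 0x02, 0x02, 0x05, 0x00, 0x01, 0xA0,
              0x05, 0x04, 0x01, 0xA3, 0x11, 0x00, 0x04, 0xFF];
console.log(disassemble(head).join("\n"));
```

Grouping by instruction makes the operands line up with their opcodes, which the fixed four-byte grouping cannot do because the instructions are 1 to 4 bytes long.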

jaynabonne
Did you create the machine.bin file above by hand? If so, there might be a typo (if not, it's just odd). The 1A at 0x34 might be A1.

(BTW, that's some convoluted code there...)

jaynabonne
Here's something to contemplate while looking through the code, as it might help to simplify it in your mind (since it does this a lot).

What is the effect of pairs of instructions like this?

xor r2, r2
add r2, r1

HegemonKhan
jaynabonne wrote:Did you create the machine.bin file above by hand? If so, there might be a typo (if not, it's just odd). The 1A at 0x34 might be A1.

(BTW, that's some convoluted code there...)


ya, that's a typo (thanks for spotting it), it is indeed supposed to be: A1

-----

yes, I did it by hand, lol. It took at least 3 hrs to do just that; one of the first things I'll do, when I get the time, is write a program to do this for me, laughs. This program will eventually turn into an assembly debugging/deciphering program... hopefully. HK laughs/grins evilly, I'll have a program that will do/show how a bin file is supposed to run correctly (sometime in the future, when I'm capable of taking it that far)
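
A rough sketch of such a dump program, in JavaScript to match the interpreter posted later in this thread. The function name `hexDump` and the exact column layout are assumptions, modelled on the hand-typed listing above (two 4-digit offset groups, 16 bytes per row, printable ASCII in groups of four, `.` for everything else).

```javascript
// Produce rows like:
// 0000 01B0 58 33 31 12 00 E3 F7 CD A9 DD 65 D0 17 4D 6A 7E X31. .... ..e. .Mj~
function hexDump(bytes) {
  const rows = [];
  for (let off = 0; off < bytes.length; off += 16) {
    const chunk = Array.from(bytes.slice(off, off + 16));
    const addr = off.toString(16).toUpperCase().padStart(8, "0");
    const hex = chunk.map(b => b.toString(16).toUpperCase().padStart(2, "0"));
    // Printable ASCII is 0x20..0x7E; anything else is shown as '.'.
    const text = chunk.map(b => (b >= 0x20 && b <= 0x7e) ? String.fromCharCode(b) : ".");
    const groups = [];
    for (let i = 0; i < text.length; i += 4) {
      groups.push(text.slice(i, i + 4).join(""));
    }
    rows.push(addr.slice(0, 4) + " " + addr.slice(4) + " " +
              hex.join(" ") + " " + groups.join(" "));
  }
  return rows.join("\n");
}

// One row of sample data (the bytes listed at 0x01B0 above, dumped from offset 0).
const sample = [0x58, 0x33, 0x31, 0x12, 0x00, 0xE3, 0xF7, 0xCD,
                0xA9, 0xDD, 0x65, 0xD0, 0x17, 0x4D, 0x6A, 0x7E];
console.log(hexDump(sample));
```

Under Node.js the same function should accept the `Buffer` returned by `fs.readFileSync`, so dumping the real file would avoid retyping it by hand.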

-----------

I'm pretty methodical and often tri-check everything, so there's probably not too many typos because of it, though can't be perfect, grr

--------

jaynabonne wrote:Here's something to contemplate while looking through the code, as it might help to simplify it in your mind (since it does this a lot).

What is the effect of pairs of instructions like this?

xor r2, r2
add r2, r1


the xor zeros the (first:left) register (if both registers are the same), and then by adding a value to it, you're "setting" it to that new value.

xor'ing a register with itself always zeroes it (due to xor truth table logic: equal bits give 0). xor'ing with some other value x2 flips the bits that are set in x2, and if you xor with the same x2 again, it returns to its original value (flipped when done an odd number of times, restored when done an even number of times, thus xor is a "bit toggle", off-on-off-on)

---------------------


jaynabonne wrote:Looks like a good plan! And keep in mind that there won't just be code in the .bin file. There will be data as well. For example, the first instruction is reading from offset 0x01a3, so you can be fairly sure that that will be a data byte and not a code byte.

So basically you shouldn't have to decode the entire file as instructions, assuming the code doesn't jump all over the place. You might only have to go as far as the HALT, if all the jnz's jump backwards (if that makes sense).

Edit: in fact, from around offset 0xab onward, it definitely looks like data for a while. :) (increasing integers from 0x0b to 0xff).


I think the main difficulty/issue will be with following (accurately) the adding and subtracting, especially with the addressing '[eax+ebx]' stuff (which is likely for jumping into the data segment to get or set a value there), and in figuring out why it's not/never terminating...

---

ah, so the bin file is just like assembly code, with a "code segment" and a "data segment".

so, the upper fourth is the code segment with the opcodes and opdata and then on the far right is any of the flags involved with those operations

then the data segment with the character/symbol values

some empty segment: extra segment, heap/free segment, etc ???

and I guess the bottom (4th) segment is another data segment ???

do the double quotes in the far right column, make the data between them a logical/virtual array segment ??? (like with networking how you can logically/virtually make subnetworks/groups, even though the physical architecture is completely different)

" .... c.... 7.... q..... Z....!..."

"...[....R....E....#.....0.......q......h.....-......w...."

jaynabonne
If it's any consolation, I did a quick mockup of a parser in Javascript, and I can't get anything but gibberish when I run this "bin". I'll see if I can figure out why. It doesn't hang though...

HegemonKhan wrote:the xor zeros the (first:left) register (if both registers are the same), and then by adding a value to it, you're "setting" it to that new value.



Yep. So those two instructions are how this processor does a "mov".
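
A small sketch of that pair in JavaScript, with the registers modelled as an array the way the interpreter later in this thread does it. The helper name `mov` is made up; the machine itself has no such instruction, which is the point.

```javascript
// Emulate "xor dst, dst" followed by "add dst, src" on 8-bit registers.
function mov(r, dst, src) {
  r[dst] = r[dst] ^ r[dst];          // xor dst, dst -> always 0
  r[dst] = (r[dst] + r[src]) & 0xff; // add dst, src -> 0 + src
}

const regs = [0, 0x2a, 0x99, 0, 0, 0]; // r1 = 0x2a, r2 holds some old value
mov(regs, 2, 1);
console.log(regs[2].toString(16));     // 2a

// XOR with a different value toggles bits; applying it twice restores the input.
let x = 0x5a;
x ^= 0x0f; // now 0x55
x ^= 0x0f; // back to 0x5a
console.log(x.toString(16));           // 5a
```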

What it looks like to me is that the data from 0x1a4 on is an encoded message, and the data from 0x00a0 on is a table used to do the decoding.
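
Reading the instruction bytes from 0x0007 to 0x009C with the opcode meanings used by the interpreter further down, the loop can be written out in high-level form roughly as below. This is only one reading of the posted dump: the constant names are invented labels, the step of 1 is the byte at 0x01a3 in the dump, and the shape (two indices, a table swap, then an XOR) resembles the output loop of the RC4 stream cipher. The demo memory is made up and is not the real machine.bin.

```javascript
const TABLE = 0x00a0, I = 0x01a0, J = 0x01a1, COUNT = 0x01a2, MSG = 0x01a4;

// Hypothetical high-level rendering of the decode loop; mutates mem.
function decodeMessage(mem) {
  let out = "";
  for (let n = 0; n < mem[COUNT]; n++) {
    const i = (mem[I] + 1) & 0xff;              // i = i + 1
    mem[I] = i;
    const j = (mem[J] + mem[TABLE + i]) & 0xff; // j = j + table[i]
    mem[J] = j;
    const ti = mem[TABLE + i], tj = mem[TABLE + j];
    mem[TABLE + j] = ti;                        // swap table[i] and table[j]
    mem[TABLE + i] = tj;
    const key = mem[TABLE + ((ti + tj) & 0xff)];
    out += String.fromCharCode(mem[MSG + n] ^ key);
  }
  return out;
}

// Build a small made-up memory image with an arbitrary table.
function demoMemory(msgBytes) {
  const mem = new Array(MSG + msgBytes.length).fill(0);
  for (let k = 0; k < 256; k++) mem[TABLE + k] = (k * 7 + 3) & 0xff;
  mem[COUNT] = msgBytes.length;
  msgBytes.forEach((b, n) => { mem[MSG + n] = b; });
  return mem;
}

// XOR is its own inverse, so running the loop twice from the same starting
// table gives the original bytes back.
const scrambled = decodeMessage(demoMemory([72, 105, 33])); // "Hi!" scrambled
const bytes = Array.from(scrambled, ch => ch.charCodeAt(0));
console.log(decodeMessage(demoMemory(bytes)));              // Hi!
```

Because the table is modified on every step, one wrong byte in the table can corrupt all later output, not just one character.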

HegemonKhan
I didn't make that realization/connection (that it's a mov), laughs, thanks for pointing it out!

so this assignment is somewhat having us decrypt an encrypted message?

jaynabonne
It looks like it. I'm wondering if there could be some signed/unsigned whackiness going on.

Ok, that was it (to some extent). I needed to make sure my 8-bit values stayed 8-bit. I get this now:

'You either disassembled the code + rewrote it, wrote an interpretter or tied your brain in knots, NÕsñžúó¼‹ü}kQø,´nD¬o‡²õü¤2§ÃÙâ÷äµ°

You'd think someone teaching a course would be able to spell "interpreter". (Sorry, couldn't resist.)

There might still be a problem, given that the bytes after the comma are garbage.

jaynabonne
To be honest, I suspect a typo in the data somewhere. Once the stream gets wrong, it will stay wrong.

In case you're interested, here's the Javascript code. (I assume since you're giving out the bin file that the class is done.)

var runner = {
    execute: function(data) {
        var ip = 0;
        var r = [0,0,0,0,0,0];

        var readByte = function() {
            return data[ip++];
        };
        // Addresses are two bytes, high byte first.
        var readWord = function() {
            return readByte()*256 + readByte();
        };

        var s = "";
        var done = false;
        var reg1, reg2, address;

        while (!done) {
            var opcode = readByte();
            switch (opcode) {
                case 0x11:
                    // ADD reg1, reg2 (reg1 = reg1 + reg2), kept to 8 bits
                    reg1 = readByte();
                    reg2 = readByte();
                    r[reg1] = (r[reg1] + r[reg2]) & 0x00ff;
                    break;

                case 0x22:
                    // SUB reg1, reg2 (reg1 = reg1 - reg2), kept to 8 bits
                    reg1 = readByte();
                    reg2 = readByte();
                    r[reg1] = (r[reg1] - r[reg2]) & 0x00ff;
                    break;

                case 0x44:
                    // XOR reg1, reg2 (reg1 = reg1 ^ reg2)
                    reg1 = readByte();
                    reg2 = readByte();
                    r[reg1] = r[reg1] ^ r[reg2];
                    break;
                case 0x05:
                    // LOAD reg1, address (reg1 = [address])
                    reg1 = readByte();
                    address = readWord();
                    r[reg1] = data[address];
                    break;
                case 0x55:
                    // LOADR reg1, address (reg1 = [address+reg1])
                    reg1 = readByte();
                    address = readWord();
                    r[reg1] = data[address + r[reg1]];
                    break;
                case 0x06:
                    // STORE address ([address] = r0)
                    address = readWord();
                    data[address] = r[0];
                    break;
                case 0x66:
                    // STORER reg1, address ([address+reg1] = r0)
                    reg1 = readByte();
                    address = readWord();
                    data[address + r[reg1]] = r[0];
                    break;
                case 0xcc:
                    // OUT reg1 (output character in reg1)
                    reg1 = readByte();
                    s += String.fromCharCode(r[reg1]);
                    break;
                case 0xaa:
                    // JNZ reg1, address (jump to address if reg1 is not zero)
                    reg1 = readByte();
                    address = readWord();
                    if (r[reg1] !== 0)
                        ip = address;
                    break;
                case 0xff:
                    // HALT
                    done = true;
                    break;
                default:
                    // Unknown opcode, or ran off the end of the data:
                    // stop instead of looping forever.
                    throw new Error("Unknown opcode " + opcode + " at offset " + (ip - 1));
            }
        }
        return s;
    }
};

var bin = [
0x05,0x04,0x01,0xA3,0x44,0x02,0x02,0x05,0x00,0x01,0xA0,0x05,0x04,0x01,0xA3,0x11,
0x00,0x04,0x06,0x01,0xA0,0x05,0x03,0x01,0xA1,0x44,0x01,0x01,0x11,0x01,0x00,0x55,
0x01,0x00,0xA0,0x11,0x03,0x01,0x44,0x05,0x05,0x11,0x05,0x00,0x44,0x00,0x00,0x11,
0x00,0x03,0x06,0x01,0xA1,0x44,0x00,0x00,0x11,0x00,0x05,0x44,0x01,0x01,0x11,0x01,
0x03,0x55,0x01,0x00,0xA0,0x44,0x05,0x05,0x11,0x05,0x01,0x44,0x01,0x01,0x11,0x01,
0x00,0x55,0x01,0x00,0xA0,0x44,0x04,0x04,0x11,0x04,0x00,0x44,0x00,0x00,0x11,0x00,
0x01,0x66,0x03,0x00,0xA0,0x44,0x00,0x00,0x11,0x00,0x05,0x66,0x04,0x00,0xA0,0x11,
0x01,0x05,0x55,0x01,0x00,0xA0,0x44,0x04,0x04,0x11,0x04,0x01,0x44,0x01,0x01,0x11,
0x01,0x02,0x55,0x01,0x01,0xA4,0x44,0x01,0x04,0xCC,0x01,0x05,0x04,0x01,0xA3,0x05,
0x05,0x01,0xA2,0x11,0x02,0x04,0x22,0x05,0x02,0xAA,0x05,0x00,0x07,0xFF,0x00,0x00,
0x63,0x06,0x37,0xA6,0x16,0x84,0xCC,0x71,0xE5,0x5A,0xCD,0x0B,0x0C,0x0D,0x0E,0x0F,
0x10,0x11,0x12,0x13,0x14,0x15,0x04,0x17,0x18,0x19,0x1A,0x1B,0x1C,0x1D,0x1E,0x1F,
0x20,0x21,0x22,0x23,0x24,0x25,0x26,0x27,0x28,0x29,0x2A,0x2B,0x2C,0x2D,0x2E,0x2F,
0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x02,0x38,0x39,0x3A,0x3B,0x3C,0x3D,0x3E,0x3F,
0x40,0x41,0x42,0x43,0x44,0x45,0x46,0x47,0x48,0x49,0x4A,0x4B,0x4C,0x4D,0x4E,0x4F,
0x50,0x51,0x52,0x53,0x54,0x55,0x56,0x57,0x58,0x59,0x09,0x5B,0x5C,0x5D,0x5E,0x5F,
0x60,0x61,0x62,0x63,0x64,0x65,0x66,0x67,0x68,0x69,0x6A,0x6B,0x6C,0x6D,0x6E,0x6F,
0x70,0x07,0x72,0x73,0x74,0x75,0x76,0x77,0x78,0x79,0x7A,0x7B,0x7C,0x7D,0x7E,0x7F,
0x80,0x81,0x82,0x83,0x84,0x85,0x86,0x87,0x88,0x89,0x8A,0x8B,0x8C,0x8D,0x8E,0x8F,
0x90,0x91,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9A,0x9B,0x9C,0x9D,0x9E,0x9F,
0xA0,0xA1,0xA2,0xA3,0xA4,0xA5,0x03,0xA7,0xA8,0xA9,0xAA,0xAB,0xAC,0xAD,0xAE,0xAF,
0xB0,0xB1,0xB2,0xB3,0xB4,0xB5,0xB6,0xB7,0xB8,0xB9,0xBA,0xBB,0xBC,0xBD,0xBE,0xBF,
0xC0,0xC1,0xC2,0xC3,0xC4,0xC5,0xC6,0xC7,0xC8,0xC9,0xCA,0xCB,0x01,0x0A,0xCE,0xCF,
0xD0,0xD1,0xD2,0xD3,0xD4,0xD5,0xD6,0xD7,0xD8,0xD9,0xDA,0xDB,0xDC,0xDD,0xDE,0xDF,
0xE0,0xE1,0xE2,0xE3,0xE4,0xE5,0xE6,0xE7,0xE8,0xE9,0xEA,0xEB,0xEC,0xED,0xEE,0xEF,
0xF0,0xF1,0xF2,0xF3,0xF4,0xF5,0xF6,0xF7,0xF8,0xF9,0xFA,0xFB,0xFC,0xFD,0xFE,0xFF,
0x00,0x00,0xAE,0x01,0x8B,0x1B,0xFC,0x2F,0xA9,0xE0,0x11,0xD6,0xE8,0x42,0x24,0x47,
0x58,0x33,0x31,0x12,0x00,0xE3,0xF7,0xCD,0xA9,0xDD,0x65,0xD0,0x17,0x4D,0x6A,0x7E,
0x1F,0xCD,0xDF,0xB9,0xDE,0xF6,0x03,0x4E,0x1C,0xFD,0xCF,0x89,0x25,0xE9,0x47,0xFD,
0xB6,0x35,0x01,0x8A,0x63,0x22,0x87,0x5B,0x52,0x0F,0x45,0x23,0x8B,0xC3,0xA9,0x30,
0x0D,0xB4,0xD0,0xFE,0xC1,0xF2,0x00,0xF0,0x71,0x68,0xED,0xE6,0x15,0x04,0xDD,0x19,
0x2D,0x0A,0x9D,0x57,0x22,0x87,0x99,0x4C,0x80,0x18,0xD7,0x27,0x2D,0x73,0x27,0x80,
0x2F,0xC0,0x29,0x9E,0x8F,0x3D,0x31,0x1D,0x74,0x84,0x64,0x88,0x33,0x1D,0x72,0x20,
0xBB,0xFC,0xD6,0x2E,0x26,0xA4,0x16,0x5C,0xF8,0x54,0x6D,0xC5,0xE2,0x4E,0xA3,0x41,
0xEE,0x12,0x38,0x1A,0xF9,0x82,0x6E,0xDC,0xC5,0x7A,0x79,0x02,0xC5,0xD9,0x74,0x3F,
0xA4,0x3E,0x66,0x36,0x4C,0xBF,0xB4,0xBD,0x9A,0x82,0x36,0x40,0x90,0x50,0x95,0xF4,
0x92,0xBD,0x77,0xB9,0x17,0xA2,0xBE,0x8E,0x1B,0x7A,0x98,0x2C,0x1E,0x8E,0x16,0x6E,
0xAB,0xCA];

var result = runner.execute(bin);

console.log(result);


HegemonKhan
hmm... so I didn't deal with signed/unsigned such as with the add/sub operations?

(also, I can try to scour for the typos in my hand written bin file vs the real bin file, not sure if I can find them though)

-------

I found a typo:

0000 0103: 63 -> 00

I put in 63, but it was supposed to be 00:

60 61 62 00 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F

jaynabonne
I don't know if you had, since your registers were 8-bit anyway. But when I was simulating it in Javascript, adding two numbers could result in a value larger than 255 (0xff), which would stay that large as opposed to having the extra bits dropped when shoved into a real 8-bit register.

For example, if you have an 8-bit register with value 0xff, and you add 1 to it, it wraps back to 0. But in my code, it would become 256 (0x100) since Javascript variables aren't limited to 8 bits. So I had to put some "and 0x00ff"'s in there to mask it back down to 8 bits.

But I don't think your code would have that problem. It was just me. It is something to keep in mind, though.
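
A tiny illustration of that masking, as a sketch; the helper names `add8` and `sub8` are made up for the example.

```javascript
// Keep results inside 8 bits, the way a real 8-bit register would.
function add8(a, b) { return (a + b) & 0xff; }
function sub8(a, b) { return (a - b) & 0xff; }

console.log(0xff + 1);      // 256: a plain JavaScript number just keeps growing
console.log(add8(0xff, 1)); // 0:   wraps like an 8-bit register
console.log(sub8(0, 1));    // 255: borrow wraps the other way
```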

jaynabonne

HegemonKhan wrote:(also, I can try to scour for the typos in my hand written bin file vs the real bin file, not sure if I can find them though)



You could also email me the bin file, if it would be easier. :) I could hex dump it and send it back to you.

jaynabonne
I just found another error in the latest code you posted.

In the jnz_enum handler, you need this:

movzx edx, program_buffer[ebx]


to be this:

movzx edx, word ptr program_buffer[ebx]


Otherwise, it will only read the 8-bit value into edx instead of a 16-bit value. That's the danger of something like movzx, where it's "overloaded" with different flavors, and you have to be careful which one you use.

That would actually explain why it loops forever - the address was 00 07, and it would grab the 00 byte instead of the 0007 word, jump back to the beginning, and reset the R2 loop index to 0.
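
The difference can be seen on the four bytes of the JNZ instruction itself. This sketch assumes the address is stored high byte first, as `readWord` in the JavaScript interpreter above does; the helper names are made up. (On x86 a `word ptr` load takes the low byte from the lower address, so byte order is a separate thing to watch in the assembly version, which is not shown in this part of the thread.)

```javascript
// JNZ r5, 0x0007 as it appears at offset 0x99 of the dump: AA 05 00 07
const jnz = [0xaa, 0x05, 0x00, 0x07];

// Read only one byte of the operand (the bug).
function readByteAt(bytes, i) {
  return bytes[i];
}

// Read the full two-byte operand, high byte first (the fix).
function readWordAt(bytes, i) {
  return bytes[i] * 256 + bytes[i + 1];
}

console.log(readByteAt(jnz, 2)); // 0: jumps to the very first instruction
console.log(readWordAt(jnz, 2)); // 7: jumps past the one-time setup
```

Jumping to 0 re-runs the `xor r2, r2` at offset 0x04, so the loop counter never reaches the message length.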

HegemonKhan
it doesn't take too long (halfway through them already), I just have to make sure I don't miss any typos...

hmm.. then where's my mistake in my operations/code, grr..

your answer suggests that it's quite a lot of work to track/map through it, as it does several resets to zero (it has to get a lot of characters and start over each time)... it'd take me some time to go through every step... which I'm going to do, but that's a lot of steps to find all the values, addresses, and movements, and etc... to then try to find where the issue is and why my program isn't working... fun fun...

HegemonKhan
oops... a simple mistake I overlooked/missed in my code, thanks for spotting it.

I just posted a typo I found in the previous post of mine or a few posts back...

0000 0103: 63 -> 00

I put in 63, but it is supposed to be: 00

60 61 62 00 64 65 66 67 68 69 6A 6B 6C 6D 6E 6F

jaynabonne
That's better. Now I get this:

'You either disassembled the code + rewrote it, wrote an interpretter or tied your brain in knots, whichever way, well done! The ke&"•~/1Š-çÁadwÔʳúüLë‹Õ2¶v/"Ï hG!Š½ô„'

So probably still something somewhere.

Be sure to check out my code correction above. I bet if you fix that (plus the other corrections I gave you), your assembly might actually work!

HegemonKhan
oh wow, wow, wow, it works, it works, it works !!!!

(I just can't bloody read the font on the command line prompt box, laughs...)

You either disassembled the code + rewrote it, wrote an interpretter or tied your brain in knots, whichever way, well done! The key you rebuilt is 'eMuL8T0r!'. Chilliwilli.

I'm not sure on that last part... lol (the font is really hard for me to read...)

-----

bloody simple mistake on my part with not having that last 'word ptr'... and I thought it was something more complex/deep with my operation algorithm logic/manipulation/calculations, laughs.

jaynabonne
Woo hoo! Congrats. :)

HegemonKhan
All thanks to you! :D

I have to thank you 2^16 times, and I think I only thanked you ~ 30 times so far ;)

--------

my (hand) re-write of the machine.bin probably still has some typos in it, preventing it from working for you

(I'll email the file to you a bit later on, as I got to do some of my other homework I've been putting off due to working on this assembly lab, as it's easier that way, no typos)

jaynabonne
lol. No problem. I'm glad we got there in the end. You had the vast majority of it right. I think if you had more time, you would have worked out the few minor issues.

HegemonKhan
Sorry about this Jay, but I'm stumped again on our next assignment:

(it's due this wed., the 16th, so only got a short window left on it to get any possible help, as I've been trying to do it on my own, sighs)

evaluating/parsing a mathematical equation

using this (user inputted) equation syntax:

number(space)operator(space)number

the 'numbers' (operands 1+2) are from a single digit char to signed dwords (1 sign bit + 31 value bits), aka: 1 to 11 characters, range: -(2^31) to +(2^31 - 1), -2147483648 to +2147483647

operator: +, -, *, /

two required functions (procedures):

ASCII to Decimal :: args: esi=source array address, ecx=source array size :: decimal value return=eax :: preserve other regs

Decimal to ASCII :: args: reg=destination array address, ecx=max size of destination array :: return=ecx: number of ascii characters generated (including the sign) to display the decimal value :: preserve other regs

also, we don't have to verify that the input is valid; all input will be valid

--------

after 3 days of trying to do just the first part (ascii to decimal function) myself, in mostly total failure (a lot of it was just trying to figure out what exactly was the logic design that the two required functions were asking of me, and how to do their design), I found this:

http://www.winasm.net/forum/index.php?showtopic=2724

and I think I partly understand it, but not fully... if you could explain it to me, I'd be appreciative. In my 3 days of trying to figure out the logic on my own... I was actually on the right track... except... I was trying to figure out how to do the bit shifting without doing bit shifting, lol. I was hoping I could put the decimals into an array, and then byte ptr move that array into eax, lol. I tried... sighs.

I'm not clear on how the flags work (carry), especially in relation to the arithmetic operations, and whether I need to do 'neg eax', if my operand(s) are negative, or whether I actually use/put the decimal value of the '-' char into, or if I add in a '1' to, the highest/most significant (32/31) bit of eax, instead.

I don't understand the bit shifting and/or bit arithmetic, nor its carrying/borrowing, nor the carry flag and nor its role, at all, sighs.

-------

my understanding:

1. the person's (the link above) code (the ascii to decimal function) will ret an operand's decimal value, meaning that I need to get the size/length of the operand for it to do so, meaning further that I need to do this twice, once for each operand. the user input array will be max length of 27 (11~operand1 + 3~space~operator~space + 11~operand2 + 2~carriagereturn~linefeed)

2. not sure how I handle the operator in terms of whether I need to get its decimal value or just use a comparison for what operation to do

3. handle the possible operations, divide by zero error, prompt user of under/over flow

4. for the decimal to ascii function, do I need to actually convert from the decimal values back into the ascii or will it automatically display correctly the ascii from the decimal values? Also, I believe that I need to make a new array (of max length: 11~operand1 + 3~space~operator~space + 11~operand2 + 3~space~equals~space + 11:result~answer)

jaynabonne
A few immediate thoughts:

1) I wouldn't bother with all the bit shifting. It's an optimization to use (n << 3) + (n << 1) instead of simply multiplying the stupid thing by 10. :) Back in "the old days", you would resort to such tricks for performance reasons, but I doubt you need to worry about that for this assignment. Just know that if you were using this a lot (e.g. parsing XML SOAP requests on a heavily hit server) it might matter. But you don't need that for your work.

2) As well, the carry check is to see if the addition has overflowed (that is, the number is too large to fit in 32-bits). If you're guaranteed to have valid numbers, you don't need to worry about that either. (I do see you mention overflow, but that will probably happen more when you do the actual operation, not when you're parsing the numbers.)

3) You will need to neg eax (or whatever you're accumulating in) if the initial character is a '-'. Simply setting the high bit would not work - you need to take the real two's-complement negative or it won't be right.
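
Points 1 and 3 can be checked numerically. This is only an illustration in JavaScript, whose bitwise operators work on 32-bit signed integers, so it behaves much like a 32-bit register here; the function name `times10` is made up.

```javascript
// Point 1: (n << 3) + (n << 1) is 8n + 2n, i.e. 10n.
function times10(n) {
  return (n << 3) + (n << 1);
}
console.log(times10(1234) === 1234 * 10); // true

// Point 3: setting the top bit is not the same as negating.
const five = 5;
const highBitSet = (five | 0x80000000) | 0; // bit pattern 0x80000005, read as signed
const negated = (~five + 1) | 0;            // invert all bits, add 1 (what neg does)
console.log(highBitSet); // -2147483643
console.log(negated);    // -5
```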

The basic idea behind converting ASCII to a number is to take the digits one by one, from left to right in this case. As you take each digit, multiply what you have so far by 10 (move it up a numeric "slot") and then add the ASCII digit - '0' (the ASCII for 0) to form the next part of the sum. You do need to check the initial character for a '-' sign, and you need to know when you have run out of digits for the first number by looking for a non-numeric (your operation character, for example).
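
The same steps as a short JavaScript sketch, so the logic can be checked before it is translated to assembly. The function name `asciiToInt` and the `{ value, next }` return shape are assumptions for the example; the assignment's real interface (esi, ecx, eax) is not modelled, and no overflow check is done.

```javascript
// Parse an optionally negative decimal number starting at text[start].
// Returns the value and the index of the first character after the number
// (for example the space before the operator).
function asciiToInt(text, start = 0) {
  let i = start;
  let negative = false;
  if (text[i] === "-") {
    negative = true;
    i++;
  }
  let value = 0;
  while (i < text.length && text[i] >= "0" && text[i] <= "9") {
    // Shift what we have up one decimal place, then add the new digit.
    value = value * 10 + (text.charCodeAt(i) - 48); // 48 is ASCII '0'
    i++;
  }
  // Negate at the end, like "neg eax" after accumulating the magnitude.
  return { value: negative ? 0 - value : value, next: i };
}

const first = asciiToInt("-1234 + 56");
console.log(first.value, first.next); // -1234 5
const second = asciiToInt("-1234 + 56", first.next + 3); // skip " + "
console.log(second.value);            // 56
```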

For converting back to ASCII, the easiest way to process it is to generate the digits backwards or use recursion (e.g. to get digits for n, if n >= 10, then get digits for n/10 and add n mod 10 as a digit to the result. Iteratively is more or less the same thing.) The problem with working it iteratively backwards is the digits come out in reverse order... Recursion solves that problem since you dive down to the first digit before you work your way back up the stack, adding the digits in order. We can discuss that further if you like. :) The alternative to working it backwards is to work out the largest power of 10 that the number has (e.g. for 12589, your initial divisor would be 10000), and then use that for your divisor, dividing your divisor by 10 each time until you get to 1. I don't have the details of that algorithm in my head at the moment, but it could be worked out.
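
Both approaches as a JavaScript sketch; the function names are made up, and the register interface and buffer-size limit from the assignment are not modelled.

```javascript
// Iterative: digits come out lowest first, so collect them and reverse.
function intToAscii(n) {
  if (n === 0) return "0";
  const negative = n < 0;
  let rest = Math.abs(n);
  const digits = [];
  while (rest > 0) {
    digits.push(String.fromCharCode(48 + (rest % 10))); // 48 is ASCII '0'
    rest = Math.floor(rest / 10);
  }
  if (negative) digits.push("-"); // the sign ends up first after the reverse
  return digits.reverse().join("");
}

// Recursive, for n >= 0: handle the higher digits first, then append the
// last digit, so the digits come out in reading order.
function digitsRecursive(n) {
  const higher = n >= 10 ? digitsRecursive(Math.floor(n / 10)) : "";
  return higher + String.fromCharCode(48 + (n % 10));
}

console.log(intToAscii(-12589));     // -12589
console.log(digitsRecursive(12589)); // 12589
```

In assembly the iterative form maps onto a divide-by-10 loop that stores remainders, and the recursive form maps onto pushing remainders on the stack and popping them afterwards.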

I'm leaving work now for home, but let me know what you think so far.

HegemonKhan
I found that my prof also provided sample code for 'ascii to bcd', in which his code also checks whether the array is even or odd in length (and then does the proper "bit placing/accumulating"). Do I need to utilize this for my code? (Though I'm not able to follow/understand his code too well. I'm a bit clueless on understanding all of this, so I'd probably have trouble implementing his code design into my attempted code design.)

I'm still a bit confused on some specifics with the ascii to decimal conversion, using the online person's code (the link I provided in previous post):

********
edit:
I understand that I can just do 'x10', but I'd need that explained to me in how it's working as well, so since I think I'm somewhat understanding the bit shifting, maybe we should just stay with using that (and it is optimized code design that we should be using regardless of if it actually makes a difference or not in this case), and explain how this bit shifting and carrying works, as I just need to understand these specifics to understand it, whereas if we were to use the 'x10', I'd probably need a lot more explaining of how it works and how to properly apply it... well, whatever, maybe you should decide which will be easier for you to use, to help me, and I'll just go with that.
********

the bit shifting is initially using '0' (=30h), and then the next char-number is the first char-number of the array, correct? Am I right to presume that this is to get the size of the shifting/multiplying (aka: 0-9 -> decimal -> x10 for digit shifting) ??? So, do I want to do the same for my code, or do I want to take whatever the array's first 0-9 char-number is and use that for my initial bit shifting, instead of '0' (30h), or do I want to actually take the sign (if there is a sign), and use that as my first char-number??? Also, aside of whether I apply it first or not, do/am I even suppose to use the sign within this entire "bit-placing/accumulating" proceedure at all or not? Also, If I'm not applying the sign into the eax (this is the holding array for the bit parsing / accumulation), how do I handle the highest bit not being used (bit 32) by the bit accumulation/parsing, is this why the person's code is using '0' (30h), or do I just use the first char-number of the array for the highest bit or should I skip past the highest bit and start with the second highest bit (bit 31) ?

also when doing the bit shifting... do I need to do anything with the carry, as if I do the 'shl 3' correctly, the first char-number gets its high bit pushed into the carry slot, correct ??? Do I need to add it back into the highest bit, or does it do that for me ??? Also, when I do the shift again for the next char-number, don't I completely lose my first char-number (and so for for each additional char-number added) ??? I'm not really understanding how this 'bit placement/accumulating' is working, presumably correctly (or do I need to do anything with it, like adding back in the carry slot's values, or having to shift it back 3 to the right, etc etc etc) ... ???

--------

I'll probably need some help with figuring out how to do the recursion or iteration for converting it back into the ascii... but I'll try on my own... if I even get that far... sighs

------

all of this, different number systems and bit usage, arithmetic, is totally new to me, so I'm a bit lost, despite that I probably should be understanding it. I did get to calculus, but that was many years ago... I'm finding myself just not as smart in math as I was back then, sighs. I guess those brain cells are dying off, as I'm getting older and older, laughs-sighs. Also, back then, none of my math classes ever covered using different number systems and bit manipulation/arithmetic. No computer science word problems, lol.

jaynabonne
What might help is to break it down to simple cases. Let's work through a few, and see if it becomes clear at all.

First case is a single digit.

Imagine that someone enters the number "9" in the string as ASCII. You'd want that to show up as the value 9 in your register when all is said and done. But the value you get when you read that character from the buffer actually has value 57 (0x39). The number zero has ASCII value 48. (See this table, if it helps: http://ascii.cl/ ). So in order to get your '9' converted to the number 9, you need to subtract the value '0' (ASCII 48) from your '9' to give you the value 9 (57-48 = 9). You will need to do that with each digit you read as you read it. That's why the code you linked to is subtracting '0' from the value before adding it to the buffer.

So that is straightforward. To convert from ASCII to a value for a single digit, you subtract '0' (the ASCII value for the character '0').

Consider now the case where someone types the number "35". The first character you read from the buffer will be '3'. You subtract '0' from that to get 3. If that were the last digit, you'd be done (as before). But looking forward, you see the next character is another digit. How do you deal with it?

You deal with it as you would numbers in general. When *you* look at the number "35" you think "thirty-five" or "three tens and five", or 3x10 + 5. Similarly, "732" would be seven hundreds, three tens and 2 ones: 7x10x10 + 3x10 + 2. As you go up in digits, they are multiplied by another 10.

As before, assume you've read the '3' and have the value 3 in your register. Now you read the '5'. What do you do with it? Well, first of all, it's clear now that the '3' you had before was actually 3 tens and what you have next is 5 ones (so far). So in order to make room for the new digit coming in, you need to multiply the current value by 10: 3 -> 30. Then you subtract '0' from '5' to get 5, and add that in: 30 + 5 -> 35.

If you were then to find more digits, you'd continue to do the same thing: multiply by ten to "shift" the digit left. Then add in the next one. If the characters were "781", your register would have successively values 7, 78 and then 781, as you encounter each digit.

'7' -> 7
'8' -> 7x10 + 8 = 78
'1' -> 78x10 + 1 = 781

It looks like the first digit is special, but it isn't. If you prime your register with 0, then you can do the same in all cases:

Initialize to 0
'7' -> 0x10 + 7 = 7
'8' -> 7x10 + 8 = 78
'1' -> 78x10 + 1 = 781

That's the basic idea for reading an ASCII string of digits and converting it to a number: for each digit, multiply your current accumulator by 10 and then add in the normalized (value - '0') digit.
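
The loop just described can be sketched in a few lines of C (used here instead of MASM only so it can be compiled and checked; the name parse_digits is hypothetical). It assumes the value fits in an unsigned int and does no overflow checking.

```c
/* Hypothetical helper: converts a run of ASCII digits to its value.
   Stops at the first non-digit (a space, an operator, the end of
   the string). Overflow is not checked. */
unsigned int parse_digits(const char *s)
{
    unsigned int acc = 0;                 /* prime the accumulator with 0 */
    while (*s >= '0' && *s <= '9') {
        /* move what we have up one decimal slot, then add the new digit */
        acc = acc * 10 + (unsigned int)(*s - '0');
        s++;
    }
    return acc;
}
```

For "781" the accumulator goes 0, 7, 78, 781, matching the table above.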

Now, you can do shifts if you want. But I personally would just do something like "imul eax,10" and be done with it.

Also, you shouldn't see a carry unless you overflow a 32-bit number. You can decide if you want to deal with that (it would be some sort of error condition), but if the teacher said the numbers will always be legal, then you can dispense with that.

jaynabonne
As far as converting from a number back to ASCII, keep the basic trick in mind: given your current number N, then N mod 10 is the next lowest digit of the number (e.g. 783 mod 10 is 3). And N div 10 is the remaining digits (e.g. 783 / 10 = 78). Multiplying by 10 shifts a digit left. Dividing by 10 shifts a digit right. And to convert from the digit back to ASCII, you have to add '0' back on before putting it in the buffer to turn it into a printable ASCII digit.

Where do you get mod (also known as "the remainder")? Look at the "div" instruction:

https://pdos.csail.mit.edu/6.828/2009/r ... 86/DIV.htm

It gives you both at once!

Size    Dividend   Divisor   Quotient   Remainder
byte    AX         r/m8      AL         AH
word    DX:AX      r/m16     AX         DX
dword   EDX:EAX    r/m32     EAX        EDX
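
As a rough C analogue (a sketch, not what any particular compiler emits), the standard div function also hands back quotient and remainder together. The helper name split_last_digit is hypothetical. One thing to keep in mind for the assembly version: the dword form divides the 64-bit value in EDX:EAX, so for an unsigned 32-bit number EDX is normally cleared (e.g. xor edx, edx) before the div.

```c
#include <stdlib.h>

/* Hypothetical helper: splits n into its last decimal digit and the
   remaining digits, the way one division by 10 leaves the quotient
   in EAX and the remainder in EDX. */
void split_last_digit(int n, int *rest, int *digit)
{
    div_t r = div(n, 10);   /* quotient and remainder from one call */
    *rest  = r.quot;
    *digit = r.rem;
}
```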

jaynabonne
If you follow the above for converting back, then it's easy to generate the number in reverse.

while N >= 10
    store next digit: (N mod 10) + '0'
    N = N / 10
store last digit: (N mod 10) + '0'


However, if your number is 9462, you'd get digits '2', '6', '4' and '9', in that order. If you know how many digits you'll be generating up front (D), then you could start at the last of those D bytes in the buffer (offset D - 1) and just work your way backwards. You can do that by dividing by 10 until you hit a number less than 10, counting how many divides you do. (e.g. 15689 would divide by 10 four times before you're left with 1, so the last digit goes 4 bytes into the buffer and you work backwards from there).
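
Here is a small C sketch of that "count first, then fill from the end" approach. The name to_decimal_backwards is hypothetical, and it assumes a 32-bit unsigned value and a buffer of at least 11 bytes (10 digits plus a terminator).

```c
/* Hypothetical helper: writes n as decimal text into buf (NUL-terminated).
   Assumes buf has room for 10 digits plus the terminator. */
void to_decimal_backwards(unsigned int n, char *buf)
{
    unsigned int t = n;
    int last = 0;                 /* offset of the last (ones) digit */

    while (t >= 10) {             /* count the divides needed to get below 10 */
        t /= 10;
        last++;
    }

    buf[last + 1] = '\0';
    for (int i = last; i >= 0; i--) {
        buf[i] = (char)('0' + n % 10);   /* lowest digit goes at the far end */
        n /= 10;
    }
}
```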

The recursive approach is more interesting, but let the above all digest first (and I'm off to bed). Let me know which bits are still unclear, at least about the first part. That part should be fairly straightforward. Going back to ASCII is the trickier part.

HegemonKhan
quick question, for indirect addressing...

array address: esi
array length: ecx

does ' esi[-ecx] ' reference index 0? Is esi[-ecx] == esi[0], and is esi[ - (ecx - 1) ] == esi[1] (pretend ecx got dec/sub), esi[ - (ecx - 2) ] == esi[2], ???

as, should I use this method above (if it works) or just increment 'ebx' (from 0: xor ebx, ebx) ???
(as if I can use ecx, then I can reference from the beginning or from the end: more versatile, compared to incrementing ebx from 0: ya, I could get the ending values too when using ebx, but I think it'd take at least 1 more operation...)

----------

also, should I use the string operations (I don't know if you're familiar with them-don't know what version they got added: lods, stos, movs, cmps, etc: err... do I use byte/word/dword ???) or not (I don't need to use them, I can use the normal operations/instructions) ???

------------

ah, so with the (left: x10) shifting (ascii to decimal), I shouldn't be getting any carry, and also the first char-number shouldn't be pushed into bit 32 (or do I use '0' as the first char-number so that bit 32 gets '0', and then afterward, I do the neg if needed, ???), ???

btw, does it matter if I use hexadecimal ('0' = 30h), or do I have to use decimal ('0' = 48t); is '01h' the same as '1t' for the bit working/shifting (48 t = 0011 0000 y = 30 h) ???

input: -1234567
eax: 0000 0000

1: 31h - 30h = 1h
eax: 0000 0001
x10 (my brain can't process this in terms of bit/binary shifting, lol) / shl 3
eax: 0000 1000
   + 0011 0000
eax: 0011 1000
   + 0011 0000
eax: 0110 1000
is this correct ?

2: 32h - 30h = 2
eax: 0110 1010
x10 (my brain can't process x10 in terms of bit/binary shifting, lol), so I'm using instead: shl 3
   0 1101 0100
   1 1010 1000
   1 0101 0000
eax: 0101 0000
   + 0011 0000
eax: 1000 0000
   + 0011 0000
eax: 1011 0000

I'm lost... argh... is this correct? or am I completely lost?

jaynabonne
To be honest, I would not be concerned with bits and shifting. Or even binary. It's only going to confuse things. That's why I was suggesting just sticking with multiplying by 10, because *it's what the algorithm is doing*, as opposed to the bit shifty fiddliness which is just trying to do an optimal x10 anyway. You would have the exact same algorithm even if you were writing it (say) in Javascript - or Quest! But we can go there a little if you wish.

You store each digit internally without the ASCII-ness: as you take in each digit you strip off the '0', because in the end you want eax to hold the actual value of the number.

So you would start out with eax as 0.

First digit, 1:

imul eax, 10     ; eax = 00000000 (still 0)
get next byte from array into ebx (movzx etc.)   ; ebx = '1'
sub bl, '0'      ; ebx = 1 (a character will always fit in the low byte, so you can just sub on bl)
add eax, ebx     ; eax = 00000001 (after first digit, value is 1)

Second digit, 2:

imul eax, 10     ; eax = 00001010 (now 10, or 0ah)
get next byte from array into ebx (movzx etc.)   ; ebx = '2'
sub bl, '0'      ; ebx = 2
add eax, ebx     ; eax = 00001100 (after second digit, value is 12)

Third digit, 3:

imul eax, 10     ; eax = 01111000 (now 120, or 078h)
get next byte from array into ebx (movzx etc.)   ; ebx = '3'
sub bl, '0'      ; ebx = 3
add eax, ebx     ; eax = 01111011 (after third digit, value is 123)

You're not accumulating the 30h stuff in there. You're stripping it out. Just x10 and then add (the digit - '0'). And you can use any sort of number notation you wish. 32 vs 020h is the same number, just specified differently. It doesn't matter internally. That's why using '0' makes more sense in this case: it actually has to do with characters, as opposed to a sort of magic number like 48 or 030h.
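
To make the "same number, different notation" point concrete, here is a tiny C sketch (the helper name digit_value is hypothetical). The character literal '0', decimal 48 and hex 0x30 are all the same value as far as the machine is concerned; only the character form says what it means.

```c
/* Hypothetical helper: numeric value of one ASCII digit character.
   Assumes c is already known to be in the range '0'..'9'. */
int digit_value(char c)
{
    return c - '0';     /* identical to c - 48 or c - 0x30 */
}
```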

As far as your first question goes, I don't know. But that's an interesting idea. Try it out! :)

The lods instruction (at least how I knew it) is just shorthand for loading and incrementing. So lodsb is just

mov al, [esi]
add esi, 1

lodsw is just:

mov ax, [esi]
add esi, 2

etc. So it's up to you if you want to use it or not. Again, that's really more of an optimization, but if you happen to have things set up where esi points to what you want and you want to load the value into al and have esi incremented when you're done, then it might fit.

jaynabonne
And what I would do is get it to work for simple cases, and then extend it. For example, just get it working parsing positive numbers as a first step. Then once that's working, make the minor changes needed to handle negative numbers. If you incrementally evolve your code rather than trying to wrap your head around the entire algorithm at once, it can become much easier. Find the next incremental step and do it:

1) make it work for a single digit. Get the structure of the algorithm in place.
2) make it work for two digits, but keep it working for one. That requires you to check for the second digit conditionally (an "if") and to merge them together.
3) convert your "if" into a loop to handle all digits. At this point you can handle positive numbers.
4) add the special code to check the first character for a '-'. If so, remember that state, skip the character and do what you did before for a positive number, but then coming back at the end to negate the result.

(This, by the way, is the essence of TDD, except you do the above in the context of writing tests, so you can be sure you don't break previous behavior as you add new behavior.).
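
Where those four steps end up can be sketched in C (again only so it can be compiled and tested; parse_signed is a hypothetical name). It assumes the value fits in an int and does not check for overflow.

```c
/* Hypothetical helper: parses an optional '-' followed by digits.
   Returns 1 and stores the value on success, 0 if no digit was found.
   Overflow is not checked. */
int parse_signed(const char *s, int *value)
{
    int negative = 0;
    int acc = 0;

    if (*s == '-') {                    /* step 4: remember the sign, skip it */
        negative = 1;
        s++;
    }
    if (*s < '0' || *s > '9')           /* below '0' or above '9': not a digit */
        return 0;

    while (*s >= '0' && *s <= '9') {    /* step 3: loop over all the digits */
        acc = acc * 10 + (*s - '0');
        s++;
    }

    *value = negative ? -acc : acc;     /* negate at the end, like neg eax */
    return 1;
}
```

Each step of the list maps to a few lines here, so a test for step 1 (a single digit) keeps passing as the later steps are added.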

HegemonKhan
question about using:

EQU (equates ~ enumeration, as I'm trying to get into the habit of not use 'magic numbers' ~ obviously for human usage having lots of these "variables:equates/enums" are good, so long as your program can handle the mem usage of using/creating all of these "variables", is this the general philosophy / best practice ??? Is it preferred to use equates even when they'd be static, non-dynamic, anyways, or shold you just plug in the literal values into all the individual code lines ??? for example of a static equate like NULL_PTR, as it's always 0, why not just use 0, why waste your mem / stack space ??? I know that by using equates you're commenting about what they represent, but besides this human reason, is there any reason to use an equate if it's just going to be static, instead of the literal values ???)

is there a difference with using them in terms of setting them to: decimal or hexadecimal ???

for example:

NULL_PTR EQU 0 t/d
vs
NULL_PTR EQU 00 h
vs
ASCII_ZERO EQU = 48 t/d
vs
ASCII_ZERO EQU = 30 h
vs
NULL_PTR EQU 48 t/d
vs
NULL_PTR EQU 30 h
vs
ASCII_ZERO EQU = 0 t/d
vs
ASCII_ZERO EQU = 00 h

I'm confused with when I need to use "ASCII_values" vs "actual_numeric_value/digits"

for example, for 'ExitProcess' do I use '0' (30 h ~ 48 t/d) or do I use 0 (0 d/t ~ 00 h) ???

or another example:

cmp byte ptr [esi], ??? ; this is the checking compare to if you've got a value for your operand (the other is its out of bounds value of, for a dword, 9)

do I use '0' (30 h ~ 48 d/t) or 0 (00 h ~ 0 d/t) ???

-------------

ah, so I was just wrongly using the 'ASCII' binary values (in just trying to understand how the shl is working ~ I know I can just use the shifting code, but I want to understand it), instead of the actual 0-9 values in trying to do the shifting work on paper, to understand it. whoops (no wonder it was coming out so weirdly for me on paper), laughs.

EDIT: thank you for the examples of all of the types/ways of doing it, that helps a lot with understanding what is going on!

EDIT 2: (Timing their execution would be a good way to examine/delve into processing speed optimization, they'd make for a good test case, hehe)

------

I'll post my code up soon of what I got done so far (still just working on stuff up to the 'asci to decimal' function and that function itself too ~ I haven't yet started on the operation coding nor the 'decimal to ascii' coding) ... as I likely have some errors/syntax (still unsure of when I need to use the data type pointers vs not needing them, so I probably went a little overboard with them). I program too slow, sighs. This is another thing I really need to improve upon... but right now, I still waste a lot of time trying to come up with the needed logic and/or design constructs for my programs, which prevent me from quickly doing these programs (as well as not being a fast typer in general, let alone to even slower, for me, of code-syntax typing), sighs.

I don't have any testable code yet... I need to learn how to quickly set up code/program in assembly that can run/be tested... sighs.

jaynabonne
The shifting code is doing a specific manipulation to generate a x10. It uses the fact that 10 = 8 + 2. So "times 10" is "times 8" + "times 2". To get a "times 8", you shift left three times. To get a "times 2" you shift left once, or add the number again.

If eax had the number, then you can do this:

mov ebx, eax ; save number
shl eax, 3   ; *8
add eax, ebx ; *9
add eax, ebx ; *10

or

mov ebx, eax ; save number
shl eax, 3   ; *8
shl ebx, 1   ; *2
add eax, ebx ; *10

or

mov ebx, eax ; save number
add eax, ebx ; *2
add eax, ebx ; *3
add eax, ebx ; *4
add eax, ebx ; *5
add eax, ebx ; *6
add eax, ebx ; *7
add eax, ebx ; *8
add eax, ebx ; *9
add eax, ebx ; *10

or

imul eax, 10 ; *10 (mul only takes a single operand, so use the two-operand imul form)
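
If it helps to convince yourself the variants agree, here is a small C sketch of the first two (function names are hypothetical). It uses unsigned values, so an overflowing result simply wraps, the same as it would in the register.

```c
/* Hypothetical helpers: two ways of computing n * 10 without a multiply. */

/* shift by 3 (times 8) plus shift by 1 (times 2) */
unsigned int times10_shift(unsigned int n)
{
    return (n << 3) + (n << 1);
}

/* shift by 3 (times 8), then add the saved number twice */
unsigned int times10_adds(unsigned int n)
{
    unsigned int saved = n;   /* mov ebx, eax */
    n <<= 3;                  /* shl eax, 3   -> *8  */
    n += saved;               /* add eax, ebx -> *9  */
    n += saved;               /* add eax, ebx -> *10 */
    return n;
}
```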

HegemonKhan
Note: I edited my previous post (some extra questions there), if you wouldn't mind looking over it again to see my new questions, I'd be appreciative.

-------------------

here's my work so far... still trying to figure out the "program flow/order" and some functions too, ...

making the labels be procedures will probably help simplify it for me, ...

I'm just trying to work/brainstorm the functions and program flow/order out for now, but will probably move them to being procedures.

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

; redacted

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; The purpose of this program is to emulate/parse a mathematical equation

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none ;makes it case sensitive

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;I had issues with trying to link to the "win32API.asm" file, (pasted it below)

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************

ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD

CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD

TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************

NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************

STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************

ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************

FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

NULL_POINTER EQU 00h

INPUT_BUFFER_SIZE EQU 1Bh
OUTPUT_BUFFER_SIZE EQU 29h

SPACE EQU 20h

QUESTION_MARK EQU 3Fh

EQUAL EQU 3Dh

NEGATIVE EQU 2Dh

ADDITION EQU 2Bh
SUBTRACTION EQU 2Dh
MULTIPLICATION EQU 2Ah
DIVISON EQU 2Ch

ZERO EQU 30h
ONE EQU 31h
TWO EQU 32h
THREE EQU 33h
FOUR EQU 34h
FIVE EQU 35h
SIX EQU 36h
SEVEN EQU 37h
EIGHT EQU 38h
NINE EQU 39h

RETURN_ERROR EQU 01h

;***********
; VARIABLES
;***********

heading byte "redacted", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to evaluate/parse",\
" a mathematical equation", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_prompt byte "Enter a mathematical equation in the form of"\
", <value(space)operation(space)value> , : ", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

output_prompt byte CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED,\
"The result is:", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

division_by_zero byte "Error: Division by zero, result is undefined"\
, CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

overflow_flagged byte "Overflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

underflow_flagged byte "Underflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_buffer byte INPUT_BUFFER_SIZE dup (QUESTION_MARK)

output_buffer byte OUTPUT_BUFFER_SIZE dup (QUESTION_MARK)

operand_1 sdword QUESTION_MARK
operand_2 sdword QUESTION_MARK
result_value sdword QUESTION_MARK

return_code dword NULL_POINTER

bytes_written dword QUESTION_MARK
bytes_read dword QUESTION_MARK
handle_standard_out dword QUESTION_MARK
handle_standard_in dword QUESTION_MARK

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

;*******************************
; Get handle to standard output
;*******************************

invoke GetStdHandle, STD_OUTPUT_HANDLE
mov handle_stardard_out, eax

;******************************
; Get handle to standard input
;******************************

invoke GetStdHandle, STD_INPUT_HANDLE
mov handle_standard_in, eax

;**************
; Program Info
;**************

invoke WriteConsoleA, handle_standard_out, offset heading, \
sizeof heading, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset history, \
sizeof history, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset purpose, \
sizeof heading, offset bytes_written, NULL_POINTER

;***********************
; Prompt for User Input
;***********************

invoke WriteConsoleA, handle_standard_out, offset input_prompt, \
sizeof input_prompt, offset bytes_written, NULL_POINTER

;****************
; Get User Input
;****************

invoke ReadConsoleA, handle_standard_in, offset input_buffer, \
sizeof input_buffer, offset bytes_read, NULL_POINTER

;*********
; Program
;*********

Pre_Start:

mov esi, offset input_buffer
movzx ebp, byte ptr bytes_read

mov edi, offset output_buffer
movzx edx, OUTPUT_BUFFER_SIZE

xor eax, eax
xor ebx, ebx
xor ecx, ecx

Get_Operand_Size:

cmp esi[ebx], SPACE
je Set_Operand_Size

add ebx, byte ptr ONE

jmp Get_Operand_Size

Set_Operand_Size:

movzx ecx, ebx

Start:

call ASCII_To_Decimal (esi, ecx)

Addition:

adc operand_1, operand_2

Subtraction:

sbb operand_1, operand_2

Multiplication:

imul operand_1, operand_2

Division:

cmp operand_2, ZERO
je Division_By_Zero

idiv operand_1, operand_2

;***********************
; Display Output Prompt
;***********************

invoke WriteConsoleA, handle_standard_out, offset output_prompt, \
sizeof output_prompt, offset bytes_written, NULL_POINTER

;****************
; Display Output
;****************

invoke WriteConsoleA, handle_standard_out, offset output_buffer, \
sizeof xxx, offset bytes_written, NULL_POINTER

Invalid_Input:

movzx return_code, RETURN_ERROR

Finish:

invoke ExitProcess, return_code

Main endp


ASCII_To_Decimal proc stdcall uses ebx ecx edx esi edi ebp,

local sign_value byte ZERO

local Pre_Start_ASCII_To_Decimal:

cmp byte ptr [esi], ZERO
je Invalid_Input

cmp byte ptr [esi], NINE
je Invalid_Input

movzx edx, byte ptr eax
sub edx, ZERO
add eax, edx

local Start_ASCII_To_Decimal:

cmp ecx, ZERO
je Sign_ASCII_To_Decimal

movzx edx, eax
shl eax, byte ptr THREE
add eax, edx
add eax, edx

movzx edx, byte ptr esi[-ecx]
sub edx, ZERO
add eax, edx

sub ecx, byte ptr ONE

jmp Start_ASCII_To_Decimal

local Sign_ASCII_To_Decimal:

cmp esi[ebx], SIGN
jne Finish_ASCII_To_Decimal

neg eax

local Finish_ASCII_To_Decimal:

ret

ASCII_To_Decimal endp

Decimal_To_ASCII proc stdcall uses eax ebx edx esi edi ebp,

ret

Decimal_To_ASCII endp

end Main

jaynabonne
add ebx, byte ptr ONE

This is going to kill you. You just want to increment ebx by 1 ("inc ebx" would do or "add ebx, 1"). You're adding the ASCII value of '1' onto it, which is adding 31h. Not what you want at all.

cmp byte ptr [esi], ZERO
je Invalid_Input

cmp byte ptr [esi], NINE
je Invalid_Input

Go back and look at the original code. It was not "je". It was using the carry to test for below or above. The code should be saying, "If the value is below '0' or above '9', then jump to invalid input." What you have is "If the value is EQUAL to '0' or EQUAL to '9', then go invalid."

shl eax, byte ptr THREE
...
sub ecx, byte ptr ONE


You're conflating things again. You need those to be the values 3 and 1, not '3' (33h) and '1' (31h). I don't see any reason to use EQU's for common numbers like that, the way the code is above (that is, for bare numbers like 3 or 1). You would set an EQU to assign a name to a bare number (e.g. "DAYS_IN_WEEK" = 7), but simply having the symbolic name THREE for the number 3 adds no semantic content whatsoever and actually obscures things, given the extra indirection someone reading your code would need to go through to be sure what's going on. It's only marginally better to use ONE to refer to '1', but even then to me it seems more obscure, unless you name them "ONE_CHAR" or something to indicate the character value for '1'.


movzx edx, byte ptr eax
sub edx, ZERO
add eax, edx

local Start_ASCII_To_Decimal:


I have no idea what this code is meant to do. EAX doesn't even have a legitimate value at this point - it's not an input parameter and you've never assigned it a value.

cmp esi[ebx], SIGN

Similarly, ebx has no value assigned either. Plus you want to check the first character in the string for the minus sign, so you need to do it up front. Otherwise, your bounds checks for '0'-'9' will force it to jump out before you even get to the first digit. (In other words, you need to have a special check and advance off of it before you begin looking at digits. Because it's not a digit.)

Finally, you're reading the string backwards, using your -ecx trick. You don't want to do that. You need to process the number from most significant to least significant as I showed above. If you read the number backwards, then you read the ones digit first, multiply it by 10, etc. which I hope you can see is wrong. :) If you do want to read it backwards, then you should maintain a multiplier which starts off 1 and then becomes 10, 100, etc in turn that you'd multiply onto your next digit to put it in the right place.
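Both reading orders can be sketched in C to show why the multiplier matters. This is a hedged illustration, not the assignment's required routine: the function names are hypothetical, the input is assumed to contain only digits, and overflow is ignored. times_ten builds x * 10 the same way the posted code does, with a shift left by 3 followed by two adds of the original value.

```c
#include <stddef.h>

/* x * 10 as (x * 8) + x + x, i.e. "shl eax, 3" then two "add eax, edx"
   where edx holds the value eax had before the shift. */
unsigned times_ten(unsigned x) {
    return (x << 3) + x + x;
}

/* Most significant digit first: multiply the running total by 10,
   then add the next digit. Assumes s[0..len-1] are all '0'..'9'. */
unsigned parse_forward(const char *s, size_t len) {
    unsigned value = 0;
    for (size_t i = 0; i < len; i++) {
        value = times_ten(value) + (unsigned)(s[i] - '0');
    }
    return value;
}

/* Least significant digit first: keep a place multiplier that goes
   1, 10, 100, ... and scale each digit by it before adding. */
unsigned parse_backward(const char *s, size_t len) {
    unsigned value = 0;
    unsigned place = 1;
    for (size_t i = len; i > 0; i--) {
        value += (unsigned)(s[i - 1] - '0') * place;
        place = times_ten(place);
    }
    return value;
}
```

For "123", the forward version goes 1, 12, 123, while the backward version goes 3, 23, 123. Both work, but the forward one needs one register less because there is no separate multiplier to maintain.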

HegemonKhan
Here's my updated code (hopefully fully fixed up, with correct logic):

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

; redacted

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; The purpose of this program is to emulate/parse a mathematical equation

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none ;makes it case sensitive

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;I had issues with trying to link to the "win32API.asm" file, (pasted it below)

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************

ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD

CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD

TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************

NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************

STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************

ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************

FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

NULL_POINTER EQU 00h

MAX_INPUT_BUFFER_SIZE EQU 1Bh
MAX_OUTPUT_BUFFER_SIZE EQU 29h

SPACE EQU 20h

EQUAL EQU 3Dh

NEGATIVE EQU 2Dh

ADDITION EQU 2Bh
SUBTRACTION EQU 2Dh
MULTIPLICATION EQU 2Ah
DIVISION EQU 2Fh

ZERO_ASCII EQU 30h

RETURN_ERROR EQU 01h

;***********
; VARIABLES
;***********

heading byte "redacted",
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to evaluate/parse",\
" a mathematical equation", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_prompt byte "Enter a mathematical equation in the form of"\
", <value(space)operation(space)value> , : ", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

output_prompt byte CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED,\
"The result is:", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

division_by_zero byte "Error: Division by zero, result is undefined"\
, CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

overflow_flagged byte "Overflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

underflow_flagged byte "Underflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_buffer byte MAX_INPUT_BUFFER_SIZE dup (?)

output_buffer byte MAX_OUTPUT_BUFFER_SIZE dup (?)

operand_1 sdword ?
operand_2 sdword ?
result_value sdword ?

return_code dword 00h

bytes_read dword ?
bytes_written dword ?
handle_standard_out dword ?
handle_standard_in dword ?

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

;*******************************
; Get handle to standard output
;*******************************

invoke GetStdHandle, STD_OUTPUT_HANDLE
mov handle_standard_out, eax

;******************************
; Get handle to standard input
;******************************

invoke GetStdHandle, STD_INPUT_HANDLE
mov handle_standard_in, eax

;**************
; Program Info
;**************

invoke WriteConsoleA, handle_standard_out, offset heading, \
sizeof heading, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset history, \
sizeof history, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset credit, \
sizeof credit, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset purpose, \
sizeof purpose, offset bytes_written, NULL_POINTER

;***********************
; Prompt for User Input
;***********************

invoke WriteConsoleA, handle_standard_out, offset input_prompt, \
sizeof input_prompt, offset bytes_written, NULL_POINTER

;****************
; Get User Input
;****************

invoke ReadConsoleA, handle_standard_in, offset input_buffer, \
sizeof input_buffer, offset bytes_read, NULL_POINTER

;*********
; Program
;*********

Pre_Start:

mov esi, offset input_buffer
movzx ebp, bytes_read

mov edi, offset output_buffer
mov edx, MAX_OUTPUT_BUFFER_SIZE

xor eax, eax
xor ebx, ebx
xor ecx, ecx

Get_Operand_Size:

cmp esi[ebx], SPACE
je Set_Operand_Size

cmp esi[ebx], CARRIAGE_RETURN
je Set_Operand_Size

add ebx, 01h

jmp Get_Operand_Size

Set_Operand_Size:

movzx ecx, ebx

Start:

call ASCII_To_Decimal (esi, ecx)
movzx operand_1, eax
add ebx, 01h
jmp Get_Operand_Size

; need to handle getting operator

; need to handle getting operand_2

Addition:

adc operand_1, operand_2

Subtraction:

sbb operand_1, operand_2

Multiplication:

imul operand_1, operand_2

Division:

cmp operand_2, 00h
je Division_By_Zero

idiv operand_1, operand_2

;***********************
; Display Output Prompt
;***********************

invoke WriteConsoleA, handle_standard_out, offset output_prompt, \
sizeof output_prompt, offset bytes_written, NULL_POINTER

;****************
; Display Output
;****************

invoke WriteConsoleA, handle_standard_out, offset output_buffer, \
sizeof output_buffer, offset bytes_written, NULL_POINTER

Invalid_Input:

movzx return_code, RETURN_ERROR

Finish:

invoke ExitProcess, return_code

Main endp


ASCII_To_Decimal proc stdcall uses ebx ecx edx esi edi ebp,

local Pre_Start_ASCII_To_Decimal:

xor ebx, ebx

cmp esi[ebx], SIGN
je Sign_To_Zero_ASCII_To_Decimal

cmp byte ptr esi[ebx], 00h
jb Invalid_Input

cmp byte ptr esi[ebx], 09h
ja Invalid_Input

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h

local Start_ASCII_To_Decimal:

cmp ecx, 00h
je Is_Sign_ASCII_To_Decimal

movzx edx, eax
shl eax, 03h
add eax, edx
add eax, edx

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h
sub ecx, 01h

jmp Start_ASCII_To_Decimal

local Sign_To_Zero_ASCII_To_Decimal:

movzx edx, byte ptr eax
sub edx, ZERO_ASCII
add eax, edx

add ebx, 01h

jmp Start_ASCII_To_Decimal

local Is_Sign_ASCII_To_Decimal:

cmp esi[01h], SIGN
jne Finish_ASCII_To_Decimal

neg eax

local Finish_ASCII_To_Decimal:

ret

ASCII_To_Decimal endp

Decimal_To_ASCII proc stdcall uses eax ebx edx esi edi ebp,

ret

Decimal_To_ASCII endp

end Main

jaynabonne
This is much better! Still a few issues, which should be easily cleaned up.

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h

I'm not sure why you're handling the first character specially. If you simply xor eax, eax and fall into the loop, it will work. The way it is now, since you've processed the first character, you should decrement ecx before entering the loop. Otherwise, you'll try to process one too many characters.

Also, you don't specify eax as an input parameter, and yet you're relying on the fact (I assume) that it's been zeroed outside the function, which is really dangerous (since it's not obvious to the caller). If you do want to prime the loop this way, besides decrementing ecx, you should just mov the value into eax and subtract ZERO_CHAR, without all the edx business (which just uses an extra register when you can use eax alone). If you do want to keep the code you have, be sure to zero out eax before using it. Otherwise, the second time you call the function to get the second number, you're going to have a rude surprise.

The sign check can be much simpler. As written it doesn't really work: it doesn't advance off the sign character, and it doesn't even read from the buffer. It just uses eax, which is 0, and subtracts ZERO_ASCII from it, which makes it go negative. You really just need something like "if the first character in the buffer is '-', skip it (inc ebx)". Then your later check will negate the number.

And at the end where you check the sign character again, you use esi[01h]. But the first character is at offset 0. So it should be esi[0].
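Putting those points together, the whole routine might look like the following sketch. It is written in C only to show the control flow, so treat it as one possible shape under stated assumptions rather than the required MASM procedure. The name ascii_to_decimal, the bool return value and the out parameter are illustrative choices, and overflow is not checked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Converts s[0..len-1] (optional leading '-', then digits) to an int.
   Returns false on invalid input; overflow is deliberately not handled. */
bool ascii_to_decimal(const char *s, size_t len, int *out) {
    int value = 0;        /* zeroed here, like "xor eax, eax" inside the proc */
    size_t i = 0;         /* plays the role of ebx */
    bool negative = (len > 0 && s[0] == '-');

    if (negative) {
        i = 1;            /* skip the sign character: "inc ebx" */
    }
    if (i == len) {
        return false;     /* empty input, or a sign with no digits */
    }
    for (; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c < '0' || c > '9') {
            return false; /* below '0' or above '9' */
        }
        value = value * 10 + (c - '0');
    }
    /* The sign is tested at offset 0 of the buffer, then applied once. */
    *out = negative ? -value : value;
    return true;
}
```

Because the accumulator is cleared inside the routine, calling it a second time for the second operand starts from zero no matter what the caller left behind.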

Hope that helps! :)

HegemonKhan
I'm still a bit confused by procedures and parameters (for some reason I'm not understanding them in assembly; they seem a bit different to me from high-level languages), along with the use of the registers (either directly or through parameters).

If I understand correctly, the 'uses' keyword merely copies the listed registers' original values onto the stack for storage, so that once the procedure is done, those original values are copied back into the registers, overwriting whatever values the procedure's operations left in them.

With this understanding, why would you need to use registers indirectly (via parameters) in your procedures?

Unfortunately, the required procedure specification only says that the array's source address (in esi) and its size/length (in ecx) are to be used as its args.

About the parameters/args: I presume this means I have to assign esi and ecx to parameter variables, which I can then use directly or pass back into esi and ecx (I'm not understanding the point of this; it seems like a redundant extra operation)?

-------

I'm not sure what you mean about 'eax' not being known to the caller. (I am aware that I need to xor eax before doing operand 2.)

Do you mean that I should have the 'xor eax, eax' inside the ASCII_To_Decimal procedure? I'm not sure whether this can be done (at least probably not with my haphazard design/program flow) ...

--------

I'm also a bit confused by what you say about the sign operations. I'm not sure what I need to do or change; I'm not quite able to follow what you're trying to explain to me (I'm probably having these issues because I'm so tired).

-------

also...

I'm also a bit confused about how I would adjust the code to avoid the 'edx' step in the ASCII-to-decimal operations...

... let me post my new code so you can see what I've done and what I've got to work with (or need to change/fix, as I'm not sure whether some of my logic is right; my program flow is also a bit confusing and not optimized, as I'm having trouble designing this stuff well, laughs-sighs).

I'm still having quite a lot of trouble with good program flow/logic/design... this project is a bit too complex for me to get and design well, at least currently. I'm also nearly out of time to work on it: it's due in exactly 15 hrs from now, and I probably won't be able to complete it. I'm not even sure I can get to tackling the arithmetic operations and then the decimal to ASCII. I'm really getting sick of this program, as I've been working on it literally non-stop since Saturday, which was why I was so brain dead and getting confused with the equates and the literal values vs the ASCII values. Sleep has been scarce, as I'm trying to get as much of this program done as I can; a miracle will be needed to get it completed, let alone working. With more time I could definitely figure this out, probably with some needed help from you, but my time's almost out, sighs...

HegemonKhan
here's my current code work:

(Sorry about the lack of comments; I haven't had time for them, as I'm just trying to make progress on the actual program.)

(I know my program design/flow/logic is really bad and haphazard; my apologies for making you figure it out without any comments. If I had more time, I'd add comments so you could follow along more easily. It takes me time and multiple versions to get my programs organized, as I'm still really a newbie at programming. As you can see, my initial code is really bad and disorganized; with time and tries I slowly get it cleaned up and somewhat readable. I'm obviously still weak at good programming logic and program flow/design. I'm a bit better with the high-level languages, as I've been doing them longer, but assembly is new, and you're seeing me at the beginning, probably like how I was when I first learned of Quest and tried to learn to code with it ~3 yrs ago, lol)

(I'm not that smart, so I have to rely on lots of trial and error to slowly get to better-designed code and programs; it takes me a lot of revisions.)

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

; redacted

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; The purpose of this program is to emulate/parse a mathematical equation

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none ;makes it case sensitive

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;I had issues with trying to link to the "win32API.asm" file (same pasted contents as in my previous post)

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

NULL_POINTER EQU 00h

MAX_INPUT_BUFFER_SIZE EQU 1Bh
MAX_OUTPUT_BUFFER_SIZE EQU 29h

SPACE EQU 20h

EQUAL EQU 3Dh

NEGATIVE EQU 2Dh

ADDITION EQU 2Bh
SUBTRACTION EQU 2Dh
MULTIPLICATION EQU 2Ah
DIVISION EQU 2Fh

ZERO_ASCII EQU 30h

RETURN_ERROR EQU 01h

;***********
; VARIABLES
;***********

heading byte "redacted",
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to evaluate/parse",\
" a mathematical equation", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_prompt byte "Enter a mathematical equation in the form of"\
", <value(space)operation(space)value> , : ", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

output_prompt byte CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED,\
"The result is:", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

division_by_zero byte "Error: Division by zero, result is undefined"\
, CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

overflow_flagged byte "Overflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

underflow_flagged byte "Underflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_buffer byte MAX_INPUT_BUFFER_SIZE dup (?)

output_buffer byte MAX_OUTPUT_BUFFER_SIZE dup (?)

operator_variable byte ?

operand_1 sdword ?
operand_2 sdword ?
result_value sdword ?

return_code dword 00h

bytes_read dword ?
bytes_written dword ?
handle_standard_out dword ?
handle_standard_in dword ?

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

;*******************************
; Get handle to standard output
;*******************************

invoke GetStdHandle, STD_OUTPUT_HANDLE
mov handle_standard_out, eax

;******************************
; Get handle to standard input
;******************************

invoke GetStdHandle, STD_INPUT_HANDLE
mov handle_standard_in, eax

;**************
; Program Info
;**************

invoke WriteConsoleA, handle_standard_out, offset heading, \
sizeof heading, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset history, \
sizeof history, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset purpose, \
sizeof purpose, offset bytes_written, NULL_POINTER

;***********************
; Prompt for User Input
;***********************

invoke WriteConsoleA, handle_standard_out, offset input_prompt, \
sizeof input_prompt, offset bytes_written, NULL_POINTER

;****************
; Get User Input
;****************

invoke ReadConsoleA, handle_standard_in, offset input_buffer, \
sizeof input_buffer, offset bytes_read, NULL_POINTER

;*********
; Program
;*********

Pre_Start:

mov esi, offset input_buffer

mov edi, offset output_buffer
mov edx, MAX_OUTPUT_BUFFER_SIZE

xor ebx, ebx
xor ecx, ecx

Start:

xor eax, eax

Get_Operand_Length:

cmp byte ptr esi[ebx], SPACE
je Set_Operand_1_Length

cmp byte ptr esi[ebx], CARRIAGE_RETURN
je Set_Operand_2_Length

add ebx, 01h

jmp Get_Operand_Length

Set_Operand_1_Length:

movzx ecx, ebx
sub ecx, ebp

Operand_1:

call ASCII_To_Decimal (esi, ecx)
movzx operand_1, eax
movzx ebp, ebx

Skip_To_Operator:

add ebx, 02h

Store_Operator:

movzx operator_variable, byte ptr esi[ebx]

Skip_Past_Operator_To_Handling_Operand_2:

add ebx, 02h
jmp Start

Set_Operand_2_Length:

movzx ecx, ebx
sub ecx, ebp

Operand_2:

call ASCII_To_Decimal (esi, ecx)
movzx operand_2, eax
movzx ebp, ebx

Determining_Arithmetic_Operation:

movzx ebp, operator_variable

cmp ebp, ADDITION
je Addition

cmp ebp, SUBTRACTION
je Subtraction

cmp ebp, MULTIPLICATION
je Multiplication

cmp ebp, DIVISION
je Division

Addition:

adc operand_1, operand_2

jmp Output

Subtraction:

sbb operand_1, operand_2

jmp Output

Multiplication:

imul operand_1, operand_2

jmp Output

Division:

cmp operand_2, 00h
je Division_By_Zero

idiv operand_1, operand_2

jmp Output

Output:

;***********************
; Display Output Prompt
;***********************

invoke WriteConsoleA, handle_standard_out, offset output_prompt, \
sizeof output_prompt, offset bytes_written, NULL_POINTER

;****************
; Display Output
;****************

invoke WriteConsoleA, handle_standard_out, offset output_buffer, \
sizeof output_buffer, offset bytes_written, NULL_POINTER

Invalid_Input:

mov return_code, RETURN_ERROR

Finish:

invoke ExitProcess, return_code

Main endp


ASCII_To_Decimal proc stdcall uses ebx ecx edx esi edi ebp,

local Pre_Start_ASCII_To_Decimal:

xor ebx, ebx

cmp byte ptr esi[ebx], SIGN
je Next_Index_ASCII_To_Decimal

cmp byte ptr esi[ebx], 00h
jb Invalid_Input

cmp byte ptr esi[ebx], 09h
ja Invalid_Input

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h

local Start_ASCII_To_Decimal:

cmp ecx, 00h
je Is_Sign_ASCII_To_Decimal

movzx edx, eax
shl eax, 03h
add eax, edx
add eax, edx

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h
sub ecx, 01h

jmp Start_ASCII_To_Decimal

local Next_Index_ASCII_To_Decimal:

add ebx, 01h
sub ecx, 01h

jmp Start_ASCII_To_Decimal

local Is_Sign_ASCII_To_Decimal:

cmp byte ptr esi[00h], SIGN
jne Finish_ASCII_To_Decimal

neg eax

local Finish_ASCII_To_Decimal:

ret

ASCII_To_Decimal endp

Decimal_To_ASCII proc stdcall uses eax ebx edx esi edi ebp,

ret

Decimal_To_ASCII endp

end Main

HegemonKhan
newer code:

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

; redacted

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; The purpose of this program is to emulate/parse a mathematical equation

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none ;makes it case sensitive

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;I had issues with trying to link to the "win32API.asm" file, (pasted it below)

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************

ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD

CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD

TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************

NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************

STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************

ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************

FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

NULL_POINTER EQU 00h

INPUT_BUFFER_SIZE EQU 1Bh
OUTPUT_BUFFER_SIZE EQU 29h

SPACE EQU 20h

EQUAL EQU 3Dh

SIGN EQU 2Dh

ADDITION EQU 2Bh
SUBTRACTION EQU 2Dh
MULTIPLICATION EQU 2Ah
DIVISION EQU 2Fh

ZERO_CHAR EQU 30h

RETURN_ERROR EQU 01h

;***********
; VARIABLES
;***********

heading byte "redacted",
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to evaluate/parse",\
" a mathematical equation", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_prompt byte "Enter a mathematical equation in the form of", \
", <value(space)operation(space)value> , : ", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

output_prompt byte CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED,\
"The result is:", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

division_by_zero byte "Error: Division by zero, result is undefined"\
, CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

overflow_flagged byte "Overflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

underflow_flagged byte "Underflow occurred", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

input_buffer byte INPUT_BUFFER_SIZE dup (?)

output_buffer byte OUTPUT_BUFFER_SIZE dup (?)

operator_variable byte ?

operand_1 sdword ?
operand_2 sdword ?
result_value sdword ?

return_code dword 00h

bytes_read dword ?
bytes_written dword ?
handle_standard_out dword ?
handle_standard_in dword ?

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

;*******************************
; Get handle to standard output
;*******************************

invoke GetStdHandle, STD_OUTPUT_HANDLE
mov handle_standard_out, eax

;******************************
; Get handle to standard input
;******************************

invoke GetStdHandle, STD_INPUT_HANDLE
mov handle_standard_in, eax

;**************
; Program Info
;**************

invoke WriteConsoleA, handle_standard_out, offset heading, \
sizeof heading, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset history, \
sizeof history, offset bytes_written, NULL_POINTER

invoke WriteConsoleA, handle_standard_out, offset purpose, \
sizeof purpose, offset bytes_written, NULL_POINTER

;***********************
; Prompt for User Input
;***********************

invoke WriteConsoleA, handle_standard_out, offset input_prompt, \
sizeof input_prompt, offset bytes_written, NULL_POINTER

;****************
; Get User Input
;****************

invoke ReadConsoleA, handle_standard_in, offset input_buffer, \
sizeof input_buffer, offset bytes_read, NULL_POINTER

;*********
; Program
;*********

Pre_Start:

mov esi, offset input_buffer

mov edi, offset output_buffer
mov edx, OUTPUT_BUFFER_SIZE

xor ebx, ebx
xor ecx, ecx

Start:

xor eax, eax

Get_Operand_Length:

cmp byte ptr esi[ebx], SPACE
je Set_Operand_1_Length

cmp byte ptr esi[ebx], CARRIAGE_RETURN
je Set_Operand_2_Length

add ebx, 01h

jmp Get_Operand_Length

Set_Operand_1_Length:

mov ecx, ebx
sub ecx, ebp

Operand_1:

call ASCII_To_Decimal ; arguments are passed in esi (address) and ecx (length)
mov operand_1, eax
mov ebp, ebx

Skip_To_Operator:

add ebx, 02h

Store_Operator:

mov al, byte ptr esi[ebx]
mov operator_variable, al

Skip_Past_Operator_To_Handling_Operand_2:

add ebx, 02h
jmp Start

Set_Operand_2_Length:

mov ecx, ebx
sub ecx, ebp

Operand_2:

call ASCII_To_Decimal ; arguments are passed in esi (address) and ecx (length)
mov operand_2, eax

Storing_Actual_Array_Length:

; movzx xxx_variable, ebx
; movzx ecx, ebx

Zeroing_Registers:

; xor eax, eax
; xor ecx, ecx
; xor edx, edx

Determining_Arithmetic_Operation:

movzx ebp, operator_variable

cmp ebp, ADDITION
je Addition

cmp ebp, SUBTRACTION
je Subtraction

cmp ebp, MULTIPLICATION
je Multiplication

cmp ebp, DIVISION
je Division

Addition:

mov eax, dword ptr operand_1 + 4
mov ecx, dword ptr operand_2 + 4
adc eax, ecx
mov dword ptr result_value + 4, eax

jmp Conversion

Subtraction:

mov eax, dword ptr operand_1 + 4
mov ecx, dword ptr operand_2 + 4
sbb eax, ecx
mov dword ptr result_value + 4, eax

jmp Conversion

Multiplication:

mov eax, dword ptr operand_1
mov ecx, dword ptr operand_2
imul ecx
mov dword ptr result_value, eax

jmp Conversion

Division:

mov eax, dword ptr operand_1
mov ecx, dword ptr operand_2
idiv ecx
mov dword ptr result_value, eax

jmp Conversion

Conversion:

call Decimal_To_ASCII

Output:

;***********************
; Display Output Prompt
;***********************

invoke WriteConsoleA, handle_standard_out, offset output_prompt, \
sizeof output_prompt, offset bytes_written, NULL_POINTER

;****************
; Display Output
;****************

invoke WriteConsoleA, handle_standard_out, offset output_buffer, \
sizeof output_buffer, offset bytes_written, NULL_POINTER

jmp Finish

Invalid_Input:

mov return_code, RETURN_ERROR

Finish:

invoke ExitProcess, return_code

Main endp


ASCII_To_Decimal proc stdcall uses ebx ecx edx esi edi ebp,

local Pre_Start_ASCII_To_Decimal:

xor ebx, ebx

cmp byte ptr esi[ebx], SIGN
je Next_Index_ASCII_To_Decimal

cmp byte ptr esi[ebx], 00h
jb Invalid_Input

cmp byte ptr esi[ebx], 09h
ja Invalid_Input

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h

local Start_ASCII_To_Decimal:

cmp ecx, 00h
je Is_Sign_ASCII_To_Decimal

movzx edx, eax
shl eax, 03h
add eax, edx
add eax, edx

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h
sub ecx, 01h

jmp Start_ASCII_To_Decimal

local Next_Index_ASCII_To_Decimal:

add ebx, 01h
sub ecx, 01h

jmp Start_ASCII_To_Decimal

local Is_Sign_ASCII_To_Decimal:

cmp byte ptr esi[00h], SIGN
jne Finish_ASCII_To_Decimal

neg eax

local Finish_ASCII_To_Decimal:

ret

ASCII_To_Decimal endp

Decimal_To_ASCII proc stdcall uses eax ebx edx esi edi ebp,

ret

Decimal_To_ASCII endp

end Main

jaynabonne

HegemonKhan wrote:If I understand correctly, the 'uses' keyword/command merely copies the registers' (original) values onto the stack for storage, so that once you're done with the procedure, those original values are copied back into your registers, overwriting whatever values were in them from your procedure's operations.

With this understanding, then why would you need to indirectly (via parameters) use registers in/for your procedures ???

Unfortunately, the required given procedure only mentions/specifies that the array source address (in esi) and its size/length (in ecx) are to be used as its args.


That was my point. :) You're not setting eax to *anything* before you do your first "add eax, edx". So you're effectively using a register with an unknown value. It happens to be 0 in this case initially because you have xor'd it at the program start (for what reason, I don't know), which means the first number will *happen* to come out correctly, but not after that. You definitely *don't* want to do that. If you want to do the add (which I still don't think you want or need to do initially), then you should at least set eax to 0 beforehand *in the function*.

As far as the sign goes, it looks like you fixed it - if there is a sign, you just increment past it, which is good.

You still have the problem of processing one too many characters in the non-signed case, since you're processing the first one outside the loop and not decrementing ecx when you add ebx, 01h before entering the loop. Rather than do that though, I still think you should just xor eax, eax to prime it and fall into the loop. There's no reason to process the first digit specially. (In fact, think about what happens if ecx comes in as 0! Or 1 even.)

Also, your check for the digit being in the range '0' - '9' is only being done on the first character. You should either move the check into the loop or just get rid of it (the requirements for this state the numbers will be legal. So validating them is not necessary. But if you're going to do it, then do it for all the characters).

So I think this code:

local Pre_Start_ASCII_To_Decimal:

xor ebx, ebx

cmp byte ptr esi[ebx], SIGN
je Next_Index_ASCII_To_Decimal

cmp byte ptr esi[ebx], 00h
jb Invalid_Input

cmp byte ptr esi[ebx], 09h
ja Invalid_Input

movzx edx, byte ptr esi[ebx]
sub edx, ZERO_CHAR
add eax, edx

add ebx, 01h
local Start_ASCII_To_Decimal:


should just be this:
local Pre_Start_ASCII_To_Decimal:
xor ebx, ebx
xor eax, eax

cmp byte ptr esi[ebx], SIGN
jne Start_ASCII_To_Decimal

inc ebx
dec ecx

local Start_ASCII_To_Decimal:


or this if you want to validate the characters:

local Pre_Start_ASCII_To_Decimal:
xor ebx, ebx
xor eax, eax

cmp byte ptr esi[ebx], SIGN
jne Start_ASCII_To_Decimal

inc ebx
dec ecx

local Start_ASCII_To_Decimal:

cmp byte ptr esi[ebx], ZERO_CHAR
jb Invalid_Input

cmp byte ptr esi[ebx], ZERO_CHAR + 09h
ja Invalid_Input
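
If it helps to see the shape of the whole thing in a higher-level language, here is a rough C sketch of the same loop. The function name, the return-code convention and the variable names are mine, not part of your assignment, and it doesn't try to detect numbers too big for 32 bits:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of the loop.
   Returns 0 on success, -1 if a non-digit character is found. */
int ascii_to_decimal(const char *text, size_t length, int32_t *value)
{
    size_t index = 0;      /* plays the role of ebx */
    uint32_t total = 0;    /* plays the role of eax, zeroed inside the function */
    int negative = 0;

    if (length > 0 && text[0] == '-') {
        negative = 1;      /* remember the sign, step past it, shorten the count */
        index++;
        length--;
    }

    while (length > 0) {   /* length plays the role of ecx */
        char c = text[index];
        if (c < '0' || c > '9')
            return -1;     /* every character is validated, not just the first */
        total = (total << 3) + total + total;   /* total * 10, as shl/add/add */
        total += (uint32_t)(c - '0');
        index++;
        length--;
    }

    /* Two's-complement negate, like "neg eax". */
    *value = negative ? (int32_t)(0u - total) : (int32_t)total;
    return 0;
}
```

Note that no digit is handled outside the loop, so a length of 0 or 1 needs no special treatment.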



Moving on, for this code:

   Addition:

mov eax, dword ptr operand_1 + 4
mov ecx, dword ptr operand_2 + 4
adc eax, ecx
mov dword ptr result_value + 4, eax

jmp Conversion

I don't know why you're adding 4 onto the variable. The variable is only 4 bytes big to begin with, so you're effectively moving beyond it. You don't do that for multiply and divide, so perhaps that was something that needed to be fixed? (The same goes for subtraction.)

And you're using "adc" instead of "add". Instead of simply adding the two numbers, you're actually adding the two numbers *plus the carry*, but it's unclear what the carry will actually be at that point. So you could randomly end up with off by 1 problems. The same goes with sbb in your subtraction. A simple "sub" will do.

You would use "adc" and "sbb" when you're doing multi-stage additions or subtractions, where you need to propagate any carry or borrow from the previous addition/subtraction to the next stage. You can also use them for weird tricks, but we don't need to go there... :) I can't see any reason to use them in this case.
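
As an illustration of what a multi-stage addition means, here is a small C sketch that adds two 64-bit values stored as separate 32-bit halves. The helper name is made up for this example; the second stage is the part that "adc" would do for you in one instruction:

```c
#include <stdint.h>

/* Hypothetical helper: adds two 64-bit values held as 32-bit halves. */
void add_two_stage(uint32_t a_lo, uint32_t a_hi,
                   uint32_t b_lo, uint32_t b_hi,
                   uint32_t *sum_lo, uint32_t *sum_hi)
{
    uint32_t lo = a_lo + b_lo;        /* first stage: a plain "add" */
    uint32_t carry = (lo < a_lo);     /* 1 if the low half wrapped around */

    *sum_lo = lo;
    *sum_hi = a_hi + b_hi + carry;    /* second stage: what "adc" does */
}
```

With a single 32-bit addition there is no earlier stage to carry from, which is why a plain "add" is all you need.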

HegemonKhan
ah, thanks for all the help! I have no clue on how to do the arithmetic operations... we got power point slides on the instruction sets, so I was just using them, and saw that the example was a 'word' with '+2', so that's where my '+4' came from, as I'm like okay maybe I do +4 for dwords lol, as I think I'm using dwords as the regs are dwords/32 bits, or should I use a different data type?

alright, so I don't need to use the adc/sbb, right? I just used them as I thought I'd need them to deal with the flags (carry/sign/zero/etc) ... I really have no clue on any of this arithmetic stuff, I've never done it before, and don't understand it at all.

how do I handle the arithmetic, and possibly flag usages too ???

I found an online edu resource that gets a bit into explaining the multiplication for me:

algorithms: repeated addition, 'shift and' add, parallel multiplication,

negative numbers:

convert to positive, multiply, then convert back to negative if only one was negative,
etc etc etc

------------

how much do the instruction sets handle, vs what you have to do/account for ???

jaynabonne
HegemonKhan wrote:ah, thanks for all the help! I have no clue on how to do the arithmetic operations... we got power point slides on the instruction sets, so I was just using them, and saw that the example was a 'word' with '+2', so that's where my '+4' came from, as I'm like okay maybe I do +4 for dwords lol, as I think I'm using dwords as the regs are dwords/32 bits, or should I use a different data type?



dwords should be fine. You said initially that you need to support 32-bit values, and that serves that perfectly.

HegemonKhan wrote:alright, so I don't need to use the adc/sbb, right? I just used them as I thought I'd need them to deal with the flags (carry/sign/zero/etc) ... I really have no clue on any of this arithmetic stuff, I've never done it before, and don't understand it at all.

how do I handle the arithmetic, and possibly flag usages too ???


You can use the overflow bit after addition or subtraction to see if it overflowed (jo/jno, etc).
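
In C terms, the condition the overflow flag reports after a signed add or subtract can be sketched like this (the function names are made up for illustration; the trick is just to do the arithmetic in 64 bits, where it cannot overflow, and see whether the result still fits in 32):

```c
#include <stdint.h>

/* Returns 1 when a + b does not fit in a signed 32-bit value,
   which is the case "jo" would catch after "add". */
int add_overflows(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}

/* Same idea for a - b, the case "jo" would catch after "sub". */
int sub_overflows(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a - (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}
```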

For multiplication, if you use the single-operand form of IMUL (e.g. IMUL EBX), then it multiplies EAX by the operand, and the 64-bit result is put in EDX:EAX. So you'd be able to check for overflow by seeing whether EDX is anything other than the sign extension of EAX: 0 when EAX is non-negative, or 0xFFFFFFFF when EAX is negative. Anything else would indicate bits straying beyond 32 bits and would be an overflow. (IMUL also sets the carry and overflow flags in exactly that case, so a jo right after it works as well.)
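
A C sketch of that check, with EDX and EAX modeled as the two halves of a 64-bit product (the function name is hypothetical). It compares the high half against the sign of the low half, which is slightly stricter than only looking for 0 or 0xFFFFFFFF:

```c
#include <stdint.h>

/* Returns 1 when a * b does not fit in 32 signed bits.
   The low 32 bits of the product are stored in *low either way. */
int mul_overflows(int32_t a, int32_t b, int32_t *low)
{
    int64_t product = (int64_t)a * (int64_t)b;          /* the EDX:EAX pair */
    uint32_t lo = (uint32_t)product;                    /* EAX */
    uint32_t hi = (uint32_t)((uint64_t)product >> 32);  /* EDX */
    uint32_t sign_fill = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;

    *low = (int32_t)lo;
    return hi != sign_fill;   /* EDX must be nothing but copies of EAX's sign bit */
}
```

For example, 65536 * 32768 leaves EDX at 0 but sets the top bit of EAX, so the 32-bit result would read as negative; the comparison against the sign fill catches that.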

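For division, the quotient can't grow beyond 32 bits when you divide one 32-bit number by another, with one exception: MININT / -1, whose true result is one larger than MAXINT. But you do need to check for 0 in the divisor (what you're dividing by) before you divide. If you try to divide by 0, it gives an exception, and IDIV raises the same exception for MININT / -1! Also remember that IDIV ECX divides the 64-bit value in EDX:EAX, so sign-extend EAX into EDX first (CDQ) rather than leaving whatever happened to be in EDX.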
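
Here is a minimal C sketch of the checks to make before dividing (the function name and return convention are made up). Besides the zero divisor, the one corner case worth guarding is the most negative value divided by -1, whose true result does not fit in 32 bits:

```c
#include <stdint.h>

/* Returns 0 and stores the quotient, or -1 when the division must not be attempted. */
int checked_divide(int32_t dividend, int32_t divisor, int32_t *quotient)
{
    if (divisor == 0)
        return -1;                       /* divide by zero */
    if (dividend == INT32_MIN && divisor == -1)
        return -1;                       /* +2147483648 does not fit in 32 bits */

    *quotient = dividend / divisor;      /* truncates toward zero, like IDIV */
    return 0;
}
```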

HegemonKhan wrote:I found an online edu resource that gets a bit into explaining the multiplication for me:

algorithms: repeated addition, 'shift and' add, parallel multiplication,

negative numbers:

convert to positive, multiply, then convert back to negative if only one was negative,
etc etc etc

------------

how much do the instruction sets handle, vs what you have to do/account for ???

You don't need to do all the shifts and adds and things. We had to do that back before processors had multiply and divide instructions, but there is just no need to any more, unless you're exploring how it works (which you don't need to do for this assignment, and it just complicates things).
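
If you're curious what the "shift and add" approach from that resource looks like, here is a short C sketch for unsigned values (the function name is made up). It is only meant to show the idea, not something you need for the assignment:

```c
#include <stdint.h>

/* Multiplies by examining b one bit at a time, low bit first.
   The result wraps around at 32 bits, like the low half of a MUL. */
uint32_t shift_add_multiply(uint32_t a, uint32_t b)
{
    uint32_t result = 0;

    while (b != 0) {
        if (b & 1u)
            result += a;   /* this bit of b is set: add the shifted copy of a */
        a <<= 1;           /* next bit of b is worth twice as much */
        b >>= 1;
    }
    return result;
}
```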

HegemonKhan
thank you again, for all of your help Jay!

let me see if I can learn the rest of this program from the solution code/program that will be available after my class today, and if not then I can ask you for help as I need it. I really appreciate all of your help, as I really want to learn assembly, and get better at writing code/program for/with it. And I of course need to learn these basics of bit shifting, arithmetic, and understanding and working with the flags. I'm really grateful for the tremendous help you've already given me, I've been learning so much, thank you Jay!

jaynabonne
No problem! Good luck and have fun. ;)

HegemonKhan
we got one more assembly assignment (having c++ run an assembly program; assembly obviously can't be inline/inside of the c++ file), but we're otherwise shifting over to computer architecture now... this will be fun... hopefully I can get this stuff... never done any electronics/electricity/engineering/physics stuff, so it might be a bit daunting for me... especially all of the logic gates and etc... but meh, I got to learn sometime, and it's now hehe, FULL SPEED AHEAD! (and if I fail the class, meh, at my age the bad grade doesn't mean much, it's more worthwhile to get what I can out of the class for what I paid for the class vs dropping out just to not get a bad grade. I take it again, and hopefully do better, now having some learning/knowledge already for the second attempt at it. The extra cost/money ain't cool, but meh, I'm trying/tried my best, sometimes people just fail at some things, and it takes them more tries than other people. Ya, it'd be nice to only need one attempt to succeed, but that's not always the case).

-------------

I had casually known a really smart student/peer in high school, who out of high school, got accepted to this college/university, I don't know how it compares to other universities/colleges (I'm not that learned on quality of university CS courses ~ been lazy), but I think it is pretty prestigious... as this person/student/peer was REALLY REALLY REALLY smart, really really good at math, physics, and programming. Anyways, it's daunting, at how little I actually know/progressed in CS, compared to what's out there... a PhD, is a long long long long ways off, laughs, sighs... Just learning what I can now at a junior college before going to a university to work towards my bachelors, and maybe some kind of job, lol. Masters and Ph. D, are far off in the future if things go well... sighs.

https://www.cs.hmc.edu/program/course-descriptions/

so, much to still learn... and probably way beyond my ability (I'm not good at math, sighs).

jaynabonne
If it makes you feel any better, I don't even have a degree. I was already working as a programmer while in college, and for various reasons, I took a "leave of absence" from school and have never gone back. Of course, I was doing what I loved, and the past several decades have been a lifelong course in computer science. :) There's no doubt lots I don't know (CS is a huge field), and I often wonder what I would have learned in a more formal environment, but I don't regret it. Sometimes I think about going back, but... what does a degree get me that knowledge from something like Udemy won't?

(Of course, I do have the luxury of lots of working experience, which helps me when getting jobs. I wouldn't even begin to suggest to anyone to *not* get a degree when starting out. It just happened to work out the way it did for me.)

Back to your first point, I was an electronics hobbyist before the programming bug bit me (casual reading and experimenting and taking things apart to see how they worked during my teen years). If you have any questions about any of that, feel free to ask here as well. I may not know the answer, but I might. :)

HegemonKhan
Definitely, especially with the U.S. school system... it really is becoming obsolete, some U.S. schools have just somewhat started to modernize, but lots of the old teaching practices are still used (droning lectures... read boring book chapters, take tests, etc etc etc, argh!), which just don't work anymore in today's high speed and time-scarce world, which despite that, actually requires much more hands-on and practice/experience/repetition of the material, too. We got the internet, we can look up the info/terms/etc, what we can't look up is guidance/practice/experience/etc of topics, designs, logic, and etc programming aspects.

Ya, most people learn stuff on their own (internet). If you already learned the material on your own, you ace school classes. If you haven't, you struggle... this is especially the case with programming classes. I already learned a bit of high level language stuff thanks to quest, so C++/Java were mostly easy classes. But, I've never learned assembly language ahead of time, so I'm struggling with it (the class) now. I see this with everyone. People who're struggling in the programming classes are the ones who're learning the material for their first time, and everyone who breezes through the class with ease already knows the material (makes you think why they're even taking the class, aside from merely being required to do so, unless there's some kind of test to bypass the class to more advanced classes, which is generally rare). That's been my experience with the teaching. They teach more for those who already know a bit of the material (here's some lecturing, and here's the programming assignment, go do it), whereas teaching for those new to the material should involve more walking through programs and concepts/designs/logic, etc. Don't get me wrong, teachers don't have much time for teaching, so they do the best with what they've got, as a lot of my classes only meet once a week and are only 3 hrs, so that's not much time for the teacher to teach a more guiding/walkthrough approach (whereas 3 hrs is a long long long time for lectures, argh. Scientifically, humans can only do an hr of lecture, any time beyond an hour, we just can't maintain attention upon it)

HegemonKhan
new lab assignment (last assembly lab, after this, it's just the digital circuitry project with the logic gates and etc):

write a (32-bit) assembly procedure that will be called from a C++ (32-bit) program. Can't do in-line assembly in the C++ program obviously. Will be provided the C++ program that will call/run our assembly procedure.

Obviously, also can't do/use:

.MODEL flat, C
MyProc proc C
specifying a start label in your end directive

purpose:

the assembly procedure is to compress a data buffer using RLE (Run Length Encoding).

the function prototype for the RLE procedure:

DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char OutputBuf)

Input:

InputBuf ; Buffer to compress

InputLength ; number of bytes in InputBuf

Output:

OutputBuf ; RLE converted data is returned here

Returns:

; size of RLE data

this algorithm can be used to decode an RLE encoded buffer:

Read the first byte of data, if the two most significant bits are set, then the 6 least significant bits represent a repeat count, and the next byte is the actual data byte to be repeated. If the two most significant bits are not set then the repeat count is equal to 1 and the byte read is the actual byte.
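
To make that rule concrete, a minimal C sketch of a decoder might look like the following. The function name and signature are assumptions for illustration, not part of the provided C++ program, and it reads "set" as meaning both of the top two bits are 1:

```c
#include <stddef.h>

/* Hypothetical decoder for the rule above. The caller must make sure
   "out" is large enough. Returns the number of decoded bytes. */
size_t rle_decode(const unsigned char *in, size_t in_len, unsigned char *out)
{
    size_t i = 0, written = 0;

    while (i < in_len) {
        unsigned char b = in[i++];

        if ((b & 0xC0) == 0xC0 && i < in_len) {
            size_t count = b & 0x3F;        /* low 6 bits: repeat count */
            unsigned char data = in[i++];   /* next byte: the value to repeat */
            while (count-- > 0)
                out[written++] = data;
        } else {
            /* Top bits not both set (or the input was cut short): literal byte. */
            out[written++] = b;
        }
    }
    return written;
}
```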

------------

I'm not even really understanding exactly what I'm doing in this assignment... so, I'm finding the instructions a bit vague as I'm unable to understand exactly what I'm doing here with this assignment.

I guess I'm taking some kind of data, and trying to compress it into a smaller size. (C++ program calls on/runs the assembly procedure to...) Am I taking in the RLE encoded data, and compressing that data into a smaller size to be then used by the C++ program?

Also, this assignment sounds/seems like it's similar to the virtual CPU emulation, where I was reading/using the opcodes, and doing various operations based upon them. Is generally the same thing to be used for this assignment, or no?

at least for me, the C++ program doesn't help too much with understanding what I'm to do with this assignment (and/or I'm just not able to understand it). Though if you want to see it, I can post/pm it to you, as maybe it can help you with what's going on, better than it does for me.

---------

I'm not sure on the correct syntax for the prototype... this is my guess at it:

RLE Encode PROTO : BYTE, : DWORD, : BYTE

; do I need to include the return type into the prototype? if yes, does it go before the label or somewhere in the list of parameter data types (and where, as I know placement matters, as this matches up with the push-pop-to-from-stack and parameter-order that the call uses) ???

; was I correct to use 'BYTE' for 'char' ??? (is 'char' 1 byte?), or should it be 'WORD' (is 'char' 2 bytes?), or should it be 'DWORD' (is 'char' the C size of 'int', = 4 bytes?), or should I use 'char' (I didn't notice a font color change though when I typed in 'char' in assembly file, that's why I'm guessing at using 'BYTE' instead)

; I presume I can place prototypes anywhere (or at least above/before the '.data' section, or do the prototypes need to go into the '.data' section, or even maybe-possibly the '.code' section?)

-------

also, for the actual procedure, do I use 'char' or ( do I use 'byte/word/dword' ) as the parameters' date type ???

-------

right now, I just need some help with the general direction of what to do/what this assignment is about... I'm just a bit lost in how to go about just starting this assignment, so I'm not sure where or what to even begin with doing or figuring out.

------

I've done badly on some (enough that I'm probably not even going to pass the class now) assignments/labs and tests, so I'm just interested now in just learning how to do this assembly stuff, so if (which is likely now, unless I ace everything else which is obviously not possible with already my struggle with the material/programming) I need to take the class again, I'll have some knowledge and practice on how to do this assembly programming which I've been struggling with this time around, so I'll do better next time I take the class, hopefully understanding assembly programming well by then, and do much better the second time around. I just want to have these assignments/programs explained, so I can learn and practice them, hopefully becoming good at them, and understand better how to program in assembly and related conceptual understandings of doing assembly programming. I'm still really struggling with the bit manipulation stuff too, sighs.

(After the semester/class, if you don't mind..., I'd like to revisit bit manipulation and bit arithmetic programming in general, as I'm still really struggling with that stuff, sighs, and it's extremely important/vital to learn, as it's a major fundamental of doing assembly programming)

Again, I'm so grateful for all the help you've already given me in understanding and learning as much assembly as I have, Jay!

----------

in the meantime, I'll be trying to see what I can research on my own about how to do this assignment too (not the old arithmetic one, I mean this current RLE one).

jaynabonne
Well, this class certainly gets into interesting things. :)

So if I read this right, you need to write the encoding part of an encoder/decoder pair. You will receive a buffer of unencoded data, and you'll need to encode it and write it to the output buffer. It tells you how to decode, but I don't think that's what you need to code. At least, that's what I took away.

The prototype you have looks wrong to me. I think it should be this:

DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char *OutputBuf);

It takes a char* input buffer and the length of that input buffer, it writes the encoded data into OutputBuf (which is also a pointer), and then it returns the length of the encoded data, so you know how big the encoded data ended up being.

It is correct to use BYTE for the data type; you'll be reading and writing bytes. (Technically, the signature for the C function should probably be "unsigned char*" instead of just "char*", but that won't make any difference to your assembly code.)

What I don't know about is how to set up the assembly function. I know the general mechanics of how the C code will call the assembly code, but the code you have shown me so far had additional directives that I've never used, so I don't know if there is a trivially easy way to set it up. I'll try to describe how we did it (and this article speaks to it as well: https://banisterfiend.wordpress.com/200 ... on-from-c/)

When the C code calls your assembly language code, it will push the arguments onto the stack. Then it will call your function. The arguments are pushed onto the stack in reverse order (and the stack grows downward), so the arguments will be in order left to right on the stack. It would look like this:


ESP +0 +4 +8 +12
Return address | Argument 1 (input buf) | Argument 2 (input length) | Argument 3 (output buf)


Unless there are special directives to set this up for you, what you would do is first save the EBP register by pushing it. This leaves the stack looking like this:


ESP +0 +4 +8 +12 +16
saved ebp | Return address | Argument 1 (input buf) | Argument 2 (input length) | Argument 3 (output buf)


Then you would mov ebp, esp. That allows you to access the parameters from the stack.
So: first argument would be at dword ptr [ebp+8], argument two would be at dword ptr [ebp+12], etc. In this case, they're all 32-bit values (I assume).

Now, as I said, you might not need to do all that. I'll see if I can find any references online whether masm has special directives to wrap your function and automatically handle it. But if not, that's how you'd have to do it.

The return value is expected to be in EAX. So the length of the encoded data must end up there.

Finally, since C functions can (potentially) take a variable number of arguments, only the caller knows how many were actually pushed. So with the default C calling convention (cdecl), you don't need to clean up the stack in your code (that is, remove the arguments). There are some calling conventions, like "Pascal" or stdcall, where the called function removes them, but you shouldn't have to for this.

Let me break it here and then pick up again to talk about the run length encoding.

jaynabonne
The run length encoding has two forms: if the two high bits of the byte are set, then the bottom six are the length, and the next byte after the count byte is the data byte to use. If the two top bits aren't both set, then the byte is just the data byte. So if you had this *encoded*:

C8 (11001000)
FF

the decoder would decode 8 0FFh bytes.

CF (11001111)
00

the decoder would decode 15 (00fh) 00 bytes

AA (10101010)

the decoder would decode a single 0AAh byte.

00 (00000000)

the decoder would decode a single 000h byte.

FF (11111111)
CC

the decoder would decode 63 (111111b = 03fh) 0cch bytes
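
To make the examples above concrete, here is a minimal C sketch of what a decoder for this format might look like. The name rle_decode and its signature are made up for illustration (the assignment only asks for the encoder), and it assumes the caller supplies an output buffer that is big enough. It also assumes that a count byte with nothing after it is simply copied through as data, which the format description does not actually specify.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the assignment: expands RLE data.
   Returns the number of decoded bytes written to out. The caller must
   make sure out is large enough (up to 63 bytes per 2 input bytes). */
size_t rle_decode(const unsigned char *in, size_t in_len, unsigned char *out)
{
    size_t i = 0;   /* read position in the encoded data */
    size_t n = 0;   /* number of bytes written so far */

    assert(in != NULL && out != NULL);

    while (i < in_len) {
        unsigned char b = in[i++];
        if ((b & 0xC0) == 0xC0 && i < in_len) {
            /* Count byte: the low six bits are the run length and the
               next byte is the data byte to repeat. */
            size_t count = b & 0x3F;
            memset(out + n, in[i++], count);
            n += count;
        } else {
            /* Plain data byte (assumption: a trailing count byte with
               no data byte after it is also treated this way). */
            out[n++] = b;
        }
    }
    return n;
}
```

Feeding it the five examples back to back (C8 FF CF 00 AA 00 FF CC) would give 8 + 15 + 1 + 1 + 63 = 88 decoded bytes.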

Now, your assignment is to go the other way - to create the encoded data. A truly inefficient way would be to take each incoming byte and just slap 0c1h onto the front of it. So 1, 2, 3, 4, 9, 10 would become 0xc1, 1, 0xc1, 2, 0xc1, 3, 0xc1, 4, 0xc1, 9, 0xc1, 10. That is just literally saying each byte is a run of one. The decoder would accept it, but it doubles the size; since none of those bytes has both top bits set, each could be written as a plain single byte instead, and the output would be the same as the input: 1, 2, 3, 4, 9, 10.

But if you had this data: 00 00 00 00 00 00 01 02 02 02 02 00 10, it's better to get this:

0c6h 00 01 0c4h 02 00 10

than replicate all the bytes out (e.g. 6 copies of 0c1h 00).

The basic algorithm is to have a loop over the data - while you have data, keep going. You need to count identical bytes in a row in the source stream and write how many you get to the output when you hit some other byte (or the end of stream). There are some tricks to this:

1) You need to stop counting when you reach the end of the source buffer. If you have any data counted at that point, you need to flush it to the output buffer as the final count.
2) You need to stop counting when you reach 03fh (63). You can only encode 6 bits worth of count at a time, so when you reach that count, you need to flush the count/data pair to the output and then go back to counting as if you were starting afresh.
3) If you end up with a single byte to write to the output but its two high bits are both set (if reg & 0c0h == 0c0h), then you need to write it out as a repeated count sequence but with a count of 1. So if you have a single byte 47h, then you can write to the output as 47h. But if you have a single byte c6h, then you have to write it as c1h c6h. Otherwise the decoder would hit the 0c6 and think it's a count byte when you mean it to be a data byte, and it would get all confused.
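
The "flush" step that all three points rely on could be sketched in C roughly like this. The name rle_emit_run is hypothetical, and the sketch assumes the caller has already limited the count to the range 1..63 (point 2) and has room for two bytes in the output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes one run of `count` copies of `value`
   (1 <= count <= 63) to out and returns how many bytes it wrote. */
size_t rle_emit_run(unsigned char *out, unsigned count, unsigned char value)
{
    assert(count >= 1 && count <= 63);

    if (count == 1 && (value & 0xC0) != 0xC0) {
        /* A single byte whose top two bits are not both set can be
           stored as-is; the decoder will not mistake it for a count. */
        out[0] = value;
        return 1;
    }

    /* Otherwise use the two-byte form: 11cccccc followed by the data. */
    out[0] = (unsigned char)(0xC0 | count);
    out[1] = value;
    return 2;
}
```

With this, a single 47h is written as 47h, while a single C6h is written as C1h C6h, matching point 3.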

I hope that's enough to digest for now. :)
Let me know what questions you have still or next.

jaynabonne
Oh, and I forgot to add: if you use this standard prolog in your function:

myproc proc
push ebp
mov ebp, esp


then be sure you have this at the end before you return:

pop ebp
ret
myproc endp

HegemonKhan
I got class now (evening classes), and when I get back it'll be late-nighttime, so, I'll get back to you tomorrow, trying to digest as much of it as I can, lol.

HegemonKhan
Jaynabonne wrote:The prototype you have looks wrong to me. I think it should be this:

DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char *OutputBuf);

It takes a char* input buffer and the length of that input buffer, it writes the encoded data into OutputBuf (which is also a pointer), and then it returns the length of the encoded data, so you know how big the encoded data ended up being.

It is correct to use BYTE for the data type; you'll be reading and writing bytes. (Technically, the signature for the C function should probably be "unsigned char*" instead of just "char*", but that won't make any difference to your assembly code.)


so... the ASM/C++ (whichever is actually calling/using/doing) the ASM procedure, is able to use the 'char' data type, correct?

(I know they're data type pointers: pointers to the 'data type' Objects/Classes, which is why I left them off in my post's question, but I wasn't sure what data type to use: 'char' vs 'byte/word/dword', so I was just asking about what data type I was supposed to be using. Sorry about the confusion from my ignoring the pointer aspect of them. BTW, in ASM, are there pointers (byte *b, a * operator instruction), or do we just use indirect addressing ???)

but... you say it is correct to use 'byte'...

so, I'm confused... do I use 'char' for the prototype and the actual procedure's (declaration/initialization) header, but for the body, I use 'byte', or do I just use 'byte' everywhere (prototype and header and body), instead of 'char', or do I use 'char' for everything ???

-------

I'm not sure what you mean by directives, but we can use the old way of pushing-popping to-from the stack manually (along with using the 'call' instruction), or we can use the 'invoke' instruction, which will do the pushing-popping to-from the stack for us, we just need to add them as arguments-parameters. But, I'm not sure if this is what you're referring to...

then, for the procedure, we can use the 'uses' which will preserve the following (self inputted-written) listed registers for us, along with being able to specify any parameters we wish to use, in the header.

if you're referring to how C++ and ASM work together, (how the C++ calls the ASM procedure), I'm not sure myself exactly...

I've been looking at these resources so far:

https://courses.engr.illinois.edu/ece39 ... ixing.html
http://lavernasbrute.blogspot.com/2010/ ... -in-c.html
http://www.c-jump.com/CIS77/MLabs/M13hi ... cture.html

trying to find resource on the procedure syntax/format and its options for you:

http://www.masmforum.com/board/index.ph ... ic=14381.0 (this seems to have pretty good/informative posts by devndave, see his/her big post)

jaynabonne
I may have slightly misspoken: your C routine will pass pointers to char - the type that corresponds to char is byte. But what you receive on the stack is not a char, but rather the address of the buffer. So you'll get a pointer for the input buffer, a dword for the input size, and a pointer to the output buffer. You'll use the pointers to access bytes.
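
Seen from the C side, the same idea might look like the sketch below: the function is handed an address and a length, and each access through the pointer touches exactly one byte. The function sum_bytes is a made-up example and has nothing to do with the assignment itself; the comments only suggest roughly which instructions would play the same role in assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: adds up the bytes of a buffer through a pointer,
   much like an assembly routine stepping through memory with [esi]. */
unsigned long sum_bytes(const unsigned char *buf, size_t len)
{
    unsigned long total = 0;

    /* In C a char is by definition exactly one byte, so BYTE is the
       matching MASM type for the data the pointer refers to. */
    assert(sizeof(unsigned char) == 1);

    while (len--) {
        total += *buf;   /* read one byte at the address (like mov al, [esi]) */
        buf++;           /* move to the next byte address (like inc esi) */
    }
    return total;
}
```

The pointer itself is a 32-bit value on this platform, which is why each pointer argument takes up one DWORD slot on the stack even though it points at bytes.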

PROTO and INVOKE are used when your MASM code calls a function (for example, a C function). You're going the other direction: calling into your assembly from C. :)

Looking up "uses" took me to what I was looking for. You can see the page here:

http://www.winasm.net/forum/index.php?showtopic=2083

So you might be able to use something like this (instead of doing the ebp management yourself):

.model flat, c  ; specify C calling convention

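; PTR BYTE is a built-in MASM type; USES saves and restores esi/edi, which the C caller expects to be preserved
RlEncode PROC USES esi edi, sourceBuffer:PTR BYTE, sourceSize:DWORD, outputBuffer:PTR BYTE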

mov esi, sourceBuffer
mov ecx, sourceSize
mov edi, outputBuffer

; do stuff with them

ret
RlEncode endp

HegemonKhan
I'm still trying to digest the rest... just woke up... (I was tired... I'm getting old, argh), going to get something to eat first, for some brain food, for tackling this stuff, laughs. I'll get back to looking through the rest of your post, after I eat. I'll see what I can do on my own from it, and post further questions/help as I come to them, as I try to do what I can for now.

------------

we can't use the 'C' calling convention, as that would make this more easy for us, the prof wants us to learn to do this without using that (and other such) method(s).

here's what we can't do:

(1) in-line assembly in the C++ program
(2) .MODEL flat, C
(3) MyProc proc C
(4) specifying a start label in your end directive

-----------

this post by devndave seems to explain it well, showing/explaining both ways (old/manual and new/auto):

http://www.masmforum.com/board/index.ph ... #msg114921

even I'm understanding it pretty well now!

jaynabonne
Ah, ok. I understand now. So, yeah, you'll need to do it manually as I described before. That page I just sent you actually shows what the proc-with-parameters stuff turns into, assembly-wise. It's pretty much what I had detailed before - push ebp / mov ebp, esp / reference parameters via [ebp+offset].

HegemonKhan
I think I'm kinda somewhat getting the encoding/decoding part, except, how do you know what to do, as I don't know if we've got the data used for the encoding-decoding (I'll check if it's in the C++ program we're given)...

where does your '1,2,3,4,9,10' come from, or are those just example values?

------

maybe it'd be better if I describe as I understand it, and you can correct me on what I've got wrong...

JayNabonne wrote:The run length encoding has two forms: if the two high bits of the byte are set, then the bottom six are the length, and the next byte after the count byte is the data byte to use. If the two top bits aren't both set, then the byte is just the data byte. So if you had this *encoded*:

C8 (11001000)
FF

the decoder would decode 8 0FFh bytes.

CF (11001111)
00

the decoder would decode 15 (00fh) 00 bytes

AA (10101010)

the decoder would decode a single 0AAh byte.

00 (00000000)

the decoder would decode a single 000h byte.

FF (11111111)
CC

the decoder would decode 63 (111111b = 03fh) 0cch bytes


I think I get this...

actual data (decoded):

(this has larger size, so actually when/if decoding the encoded data, we're expanding the size of it)

FFFF-FFFF-FFFF-FFFF-0000-0000-0000-0000-0000-0000-0000-000X-AA00-3F3F...(62 more 3Fs = 31 more 3F3Fs) // actual form, I think...

// separated to match up with your example, for easier understanding for us humans:

FF-FF-FF-FF--FF-FF-FF-FF
00-00-00-00--00-00-00-00--00-00-00-00--00-00-00-XX
AA
00
3F.......................................................(x63)

encoded data:

(this has smaller size, thus the encoding of the data "compresses" it)

C8FF-CF00-AA00-FFCC // actual form, I think...

// separated to match up with your example, for easier understanding for us humans:

C8FF
CF00
AA
00
FFCC

--------------

is this correct ???

--------------

and what I need to do with the ASM procedure is iterate through the expanded (actual/decoded) data, looking for when the values change (when the count ends for each of these segments of data), as this marks the different operations/whatever of the encoding algorithm I'm to do to encode (and thus compress) the data.

example pseudocode:

start:
store (new/next) value
cmp stored_value' bits (7+6), 11h ; to determine which operation I do
cmps buffer_data, stored_value ; checking for when the value changes (end of current data value's count)
loop start

is this correct ???

jaynabonne

where does your '1,2,3,4,9,10' come from, or are those just example values?


Yeah, they were just example values.

As far as the rest goes, it's more or less right. The main thing to help you avoid any confusion is to really just treat them as bytes. You're grouping them as if they're 16-bit words, and they're not. You take them a byte at a time. Your decoded data was close. It would actually be:

FF-FF-FF-FF--FF-FF-FF-FF
00-00-00-00--00-00-00-00--00-00-00-00--00-00-00-AA
00
CC.......................................................(x63)

(I don't know where the XX came from, but I think it had to do with you looking at it as 16-bit words.)

Similarly, the encoded data would be just:
C8 FF CF 00 AA 00 FF CC

Eight consecutive bytes...

And, yes, the basic algorithm is to count consecutive matching bytes and then write them the correct way. (e.g. a repeat count > 1 would be stored as the two byte form, count + data. A single byte would be stored in the single byte form - just the data byte - unless it has the two top bits set, in which case you need to store it as count-of-1 + data byte, to avoid confusing the decoder).

HegemonKhan
example pseudocode:

start:
store (new/next) value
cmp stored_value's bits (7+6), 11h ; to determine which operation I do
cmps buffer_data, stored_value ; checking for when the value changes (end of current data value's count)
loop start

is this correct ???

(I just edited this into my last post), I was too slow, lol

-------

ya, you understood about my 'X/XX/NA' correctly! I wasn't sure if the next value (AA) went into its lsb, or if I were to place a '00' into its hsb and the 'AA' would be in the next unit (byte):

{00}-00-00-00--00-00-00-00--00-00-00-00--00-00-(00)-[AA]
00
CC... (just caught that it's CC, and not 3F - oops on my part)

or

[00]-{00}-00-00--00-00-00-00--00-00-00-00--00-00-00-(00) ; shifted right
AA
00
CC... (just caught that it's CC, and not 3F - oops on my part)

but, you already answered this now, just explaining what my confusion was, which you correctly understood and answered already.

---------

my mind is still 'bit manipulation' focused from that arithmetic lab and test I've just done... being able to work with bytes (reg's subdivisions: ax, ah, al), makes this much easier than the bits within a byte... lol

jaynabonne

ya, you understood about my 'X/XX/NA' correctly! I wasn't sure if the next value (AA) went into its lsb, or if I were to place a '00' into its hsb and the 'AA' would be in the next unit (byte)


Just to be clear... :) If by "lsb" you mean "least significant byte" (as opposed to least significant bit), then I wouldn't even look at it that way, because that implies a multi-byte value. But you don't need to combine the bytes with each other in any way. Just keep them bytes. So 'AA' goes into the next byte address, period, just as all the previous bytes did, one by one.

example pseudocode:

start:
store (new/next) value
cmp stored_value' bits (7+6), 11h ; to determine which operation I do
cmps buffer_data, stored_value ; checking for when the value changes (end of current data value's count)
loop start

is this correct ???


What I would do first is imagine how you would do it by hand. Forget bits and all that. Let's say I gave you the following data:

1 1 6 6 6 8 8 2 4 4 4 4 5 5 9

You would first look at the 1, then you'd look at the 1 after it and count two 1's. Then you'd look at the next value, and it's 6, which doesn't match 1. So then you output what you have so far:

"I have two ones."

Then you continue on with the 6's. You have one so far. So you look at the next and it's still 6, so you bump your count to 2, you look at the next and you bump your count to 3. Then the one after that is 8, so now you know how many 6's you have. So you then say:

"I have three sixes."

And you'd proceed on.

You do all the counting in registers. You need to know what value you're looking for and how many you have so far. The part where you "say" how much you have is where you write it to the output buffer. And what you write to the output buffer depends on what you have. Multiple bytes get written as the two-byte form (count + data). A single byte without the two high bits set gets written as a single byte. And a single byte with the two top bits set gets written as count-of-1 + data. But that happens *after* you have counted! The first thing you need to do is count. Then once you've counted, you figure out what to write to the output buffer.

One thing you may notice (which makes it slightly tricky) is that sometimes you're reading a byte to get the first value to compare, whereas other times, you're reading bytes to compare to what you have. To me that implies a two part loop: you have an outer loop that you jump back to when the value changes, and you'd have an inner loop you loop back to while you still have matching values.

Something like this (IGNORE IF YOU WANT TO WORK IT OUT YOURSELF):





if source length is 0, return 0 right away
outer:
read next byte (this becomes the current value)
inc position in buffer
set count to 1
dec source bytes left - if 0, jump to finish.
inner:
compare next value in buffer (the byte at the current position) to current value
if not equal jump to write
if count is already 63 jump to write
inc position in buffer
inc count
dec source bytes left - if 0, jump to finish.
jmp inner

write:
write the count + data to buffer as detailed above
jmp outer

finish:
write the count + data to buffer as detailed above <- could be your own subroutine since you need it in two places.
load up eax with the final count (the number of bytes written to the output buffer)
return


If you use esi for the input buffer and edi for the output buffer, then you can use lodsb and stosb. But you don't have to.
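
If it helps to check your assembly against something, the same outer/inner loop structure could be written in C roughly as follows. This is only a reference sketch: the name rle_encode_ref and the use of size_t are assumptions, not the assignment's required signature, and it assumes the output buffer can hold up to twice the input length (the worst case, when every byte needs a count byte in front of it).

```c
#include <assert.h>
#include <stddef.h>

/* Reference sketch only; returns the number of encoded bytes written.
   out must have room for up to 2 * in_len bytes. */
size_t rle_encode_ref(const unsigned char *in, size_t in_len, unsigned char *out)
{
    size_t i = 0;   /* read position in the source buffer */
    size_t n = 0;   /* bytes written to the output so far */

    assert(in != NULL && out != NULL);

    while (i < in_len) {                     /* outer loop: start a new run */
        unsigned char value = in[i++];
        unsigned count = 1;

        /* inner loop: extend the run while the next byte matches,
           stopping at the end of the input or at a count of 63 */
        while (i < in_len && in[i] == value && count < 63) {
            i++;
            count++;
        }

        /* write step: single "safe" bytes go out as-is, everything
           else goes out as a count byte followed by the data byte */
        if (count == 1 && (value & 0xC0) != 0xC0) {
            out[n++] = value;
        } else {
            out[n++] = (unsigned char)(0xC0 | count);
            out[n++] = value;
        }
    }
    return n;
}
```

For the input 00 00 00 00 00 00 01 02 02 02 02 00 10, this sketch produces C6 00 01 C4 02 00 10, and a run of 64 CC bytes comes out as FF CC C1 CC because the count is capped at 63.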

HegemonKhan
ah thanks, I think I had this notion myself, just didn't go into the full-entire individual steps in my pseudocode (it was just to be a brief main point skeleton pseudocode, I was aware that there were more steps involved, just lazy and didn't want to write them all in), so I was already knowing how to do it generally.

though, I wasn't quite sure on how/what I was to do with my code after the counting (the storing of it as: <count><value> or <value>, and how to handle the different cases of it), so your explaining of that specific was helpful... and I'd probably not realize the need for nested looping (2 loops, like when getting/working with a 2D array's/matrix' values)... I would've had a lot of grief and/or bugging of you, so this has helped me avoid the grief and the bugging of you about what to do with once I got the counting, lol.

jaynabonne
Let me know if you have any more questions. :)

HegemonKhan
I should be good now... I think I got the actual procedure's operations down... barring any unforeseen issues as I try to do it... so, at this point, my only possible other questions might be with the setting up of the procedure (its heading and etc), but let me see what I can do, if I can get it all working, and if not then I'll ask the questions I'll have. Give me a day or two, to see if I can do this on my own, unless I run into problems, then I'll be posting those questions right away, lol. It'll probably take me a day or 2 to get this assignment done, barring any issues that might arise. I'm not the fastest programmer yet, especially when I've got to work out how to do the code, and I need to take breaks as the sitting for too long gets irritating (I don't have the best chair), and I just get tired too from staring at the computer screen for too long and also just from coding, takes a lot of brain power, as I'm not that smart.

Again, Thank You, Jay! You really explain these things well, helping me get the concept down/understood really well, along with helping me fix or troubleshoot the actual code implementation too. You really helped me understand this assembly programming! I've been getting the assembly and computer architecture stuff pretty well, from the class, but the class doesn't really help with how to do these labs, which are totally new to me, so I've had a hard time doing them on my own, as I often don't know where to begin. But, you've been giving me that initial explanation or starting point hint/push, and from that, I'm then able to generally get and do the lab mostly on my own. So, I'm pretty happy with myself, barring my poor test taking and some assignments, hurting my grade, grr (I used to be so good at doing tests, don't know what happened in my getting old/older, lol. Guess, I'm losing my test-taking brain cells, sighs). In terms of the material though, I think I'm learning and understanding it, which is what matters for me. I can't be a programmer if I don't know how to program, lol. So, I thank you for helping me with that initial push/hint/understanding of how these labs work, along with the class lectures and materials and etc, I am doing not too badly in learning this assembly and computer architecture (I was able to get the boolean algebra, those k-maps, and etc stuff on my own, ya!). All the concepts, tactics, designs, and methods that I've learned of the high level languages, from quest, didn't prepare me for the NEW/UNIQUE concepts, tactics, methods, and designs that are involved with low level languages, like assembly. This is why I needed that initial explanation/help/hint/push, as I never learned of opcode/instructions and bit arithmetic/manipulation designs from quest, and my C++ and Java classes. And this (assembly) class unfortunately never really helped with this transition into these new concepts and designs.
We just cover the syntax, formatting, and the usage of the various instruction sets/commands, but not the conceptual designs of doing programming in assembly. It's hard if you don't already know of those conceptual designs. Hard to intuitively come up with them or realize them, on your own, out of the blue (at least for my level of intelligence, anyways).

HegemonKhan
after 3 days... and even looking at your code (thanks by the way, I needed it as I would have completely not have figured out what I hope I got figured out now, argh), I think I finally got hopefully some functional operational logic... grrr. Initially, the more I tried to understand it, the more confused I got, 3 days of working on just the operational logic... HK is very stupid, sighs.

Anyways, if you could look at the operational logic (the procedure/the procedure's algorithm), and see if it looks like it works, or if I got some logic and/or syntax and/or (bit) arithmetic/manipulations issues with it. This is my best guess after 3 days of working on it.

--------

after that, then we can tackle getting the rest of my code correct for this C++ reading/usage of my ASM procedure stuff.

--------

anyways, here's my entire code (there's some redundancy and/or mis-match of things, as I just added them for notes as I read/researched, not knowing what to do with them, or if I even need them or not):

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

; redacted
; Due: 5:00 pm, Wed., April. 6, 2016

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; Credit (those who helped me)
;------------------------------------------------------------------------------

; Online person Jay
; (various online webpage resources, that I need to add in here still)
; (a few colleagues' help too)

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; The purpose of this program is to encode (compress) a data file.

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none ;makes it case sensitive

PUBLIC RLE_Encode

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;I had issues with trying to link to the "win32API.asm" file, (pasted it below)

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************

ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD

CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD

TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************

NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************

STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************

ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************

FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; Prototypes
;------------------------------------------------------------------------------

DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char *OutputBuf) public

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

NULL_POINTER EQU 00h

;***********
; VARIABLES
;***********

heading byte "Due: 5:00 pm, Wed., April. 6, 2016",
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

credit byte "Credits (those who helped me): ", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED, \
"Online person Jay", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to encode ", \
"(compress) a data file." \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

push eax
call RLE_Encode

;**********************
; Terminating Program
;**********************

Finish:

invoke ExitProcess, NULL_POINTER

Main endp

;************
; Procedures
;************

DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char *OutputBuf) proc public \
stdcall uses ebx, ecx, edx, esi, edi, ebp, esp

push eax

; push ebp
; movzx ebp, esp
; I ran out of registers, not sure what to do, if I can't directly use esp if I need to, if I need to use ebp (storing esp)
; I'm using ebx for tallying-storing the each-time-counts
; I'm using ebp for tallying-getting the length of edi, would 'lengthof edi' work for this instead, or no?
; I'm using eax for storing the each-time-initial value
; I'm using ecx for subtracting the length of the buffer ("EoVs: End of Values -- in the input buffer")
; I'm using edx for storing the the next value, to compare more efficiently (reg:reg) with eax (storing the each-time-initial value)
; I could do reg:mem for the comparison (not using/freeing up the edx), I think I can design-wise anyways, but this would be less efficient than doing reg:reg

movzx esi, InputBuf
movzx ecx, InputLength
movzx edi, OutputBuf

xor edx, edx

Outer_Loop:

movzx eax, byte ptr [esp][esi]
movzx ebx, 01h
sub ecx, 01h
cmp ecx, 00h
je Single_Byte_One
add esi, 01h

Inner_Loop:

movzx edx, byte ptr [esp][esi]
cmp eax, edx
jne Single_Byte_One

add esi, 01h
add ebx, 01h
sub ecx, 01h

cmp ebx, 3Fh
je Multiple_Byte

cmp ecx, 00h
je Multiple_Byte

jmp Inner_Loop

Multiple_Byte:

add ebx, 0C0h
movzx [esp][edi], byte ptr ebx
add edi, 01h
movzx [esp][edi], byte ptr eax
add edi, 01h
add ebp, 02h

cmp ecx, 00h
je Finish

jmp Outer_Loop

Single_Byte_Two:

movzx [esp][edi], byte ptr 0C1h
add edi, 01h
movzx [esp][edi], byte ptr eax
add edi, 01h
add ebp, 02h

cmp ecx, 00h
je Finish

jmp Outer_Loop

Single_Byte_One:

cmp ebx, 01h
jne Multiple_Byte

cmp ah, 0Ch
jae Single_Byte_Two

movzx [esp][edi], byte ptr eax
add edi, 01h
add ebp, 01h

cmp ecx, 00h
jne Outer_Loop

Finish

movzx eax, ebp

pop ebp
pop eax

ret

RLE_Encode endp

;*************************
; Program End/Termination
;*************************

end Main


----------------------------------------

P.S.

Here are some of my colleagues discussing the assignment (specifically, how to get the C++ program to call/link to the MASM32 procedure):

"I am undergoing great difficulty in trying to call an asm file in c++ without using c as a language specifier. I keep getting a Linker Error due to multiple conflicting symbols. Anyone know how to fix this problem without copying and pasting the disassembly into my project? (redacted)"

"I am also having the same problem. From what I've looked up, the object files need to be linked but I'm not sure how to do this so I emailed Professor (redacted). I'll try and post his response here when he replies. (redacted)"

"After a lot of struggle I was finally able to find a solution to this problem. The problem is because of trying to use parameter lists in the procedure declaration. If you try to do this, the assembler needs to know what language convention it should follow for use with the parameters. Because we are not allowed to specify a language type of the project, the only option is to not use the built in parameter lists. This means you need to manually get your parameters off of the stack. You can do this using the ebp register inside your procedure. At the location of [ebp + 4] we will find the return address for the procedure, at the location [ebp + 8] we will find the parameter that was pushed last. For the project, the parameters are pushed from right to left, so the parameter at [ebp + 8] will be the address of the input buffer. You can then get the other parameters by adding the number of bytes in the data type. (redacted)"

"I'm having difficulty with this as well, I'm not able to get past the linker error despite avoiding the parameter list within the procedure. My instinct is that the C++ compiler won't recognize the procedure as the proper function that it is looking for since the parameters aren't defined. I might be wrong, but if someone else has any ideas on how to get past this I would really appreciate it. (redacted)"

"
The linker should still be able to link correctly even without the parameters. Here is what I have for my procedure prototype and the start directives, hopefully this will help:

.586
.MODEL flat
_RLE_Encode proto
.data
.code
_RLE_Encode proc public
(redacted)
"

"You may not need the underscore, as I believe that the default (via not specifying a different one) call convention is stdcall, which will place the underscores automatically for you, but I could be wrong, as I'm not sure if this still applies for when this MASM32 procedure is to be called by the C++ program. (redacted)"

"I don't think so because when I remove the underscores I get linker errors. If you have problems without them I would suggest trying it with them. (redacted)"

"
I'm not sure if the 'uses' means that we need to do more displacement (to get past these stack items manually), or not (MASM32/C++ handles it automatically for us).

I'm also not sure if we are to push the registers a second time to the stack, when using the 'uses' too. I'm still confused at exactly how the MASM32 to C++ works in regards to the stack. My understanding is that the stack is the connection/bridge between the MASM32 and C++, for them to work together. So, I'm not sure what needs to be pushed (or pushed again) or possibly not need to be pushed.

I guess I could just not use the 'uses', and manually push/pop them instead, along with applying the correct additional displacement required.
(redacted)
"

"
I don't think you will need to use any extra displacement because of the USES. Because of the way the stack frame is set up you can access the parameters using [ebp + offset] and then the uses will modify esp and not ebp.

For exactly what to do with USES, that is up to you. Using USES is just syntax sugar, you can do exactly what USES does by hand, if you want to. All USES does is translate into a series of pushes and pops. If you write "USES ebx ecx edx", the assembler will translate this into:

push ebx
push ecx
push edx

;The rest of your procedure goes here:

pop edx
pop ecx
pop ebx
(redacted)
"

jaynabonne
So some things.

First, I don't think you need to use movzx so much. You only need to use that when you are reading a smaller byte value than the target is, so that you need it to be "zero extended" to the larger type (e.g. loading a byte into a 32-bit register). So, for example, you would never use "movzx ebp, esp", because ebp and esp are the same size. You would just use "mov ebp, esp".
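
To make the distinction concrete, here is a rough C analogue (a sketch only; the function names are made up for illustration and are not part of the assignment). Widening a byte to 32 bits is the one case where an extending move is needed; copying a value that is already 32 bits wide has nothing to extend.

```c
#include <stdint.h>

/* Hypothetical helper: widening a byte to 32 bits, roughly what
   "movzx eax, byte ptr [esi]" does. The upper 24 bits become zero. */
static uint32_t load_byte_zero_extended(const uint8_t *p) {
    return (uint32_t)*p;
}

/* Hypothetical helper: a same-size copy, roughly what "mov ebp, esp" does.
   Source and destination are both 32 bits, so no extension is involved. */
static uint32_t copy_dword(uint32_t value) {
    return value;
}
```

Loading the byte 0xC5 through the first helper gives 0x000000C5; the second helper returns its argument unchanged.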

And something like "movzx [esp][edi], byte ptr ebx" is wrong on a number of levels. First, you don't want to be using esp at all, except to set up the stack frame in the beginning. edi will point to the output buffer, period. There are no additional offsets needed. Just [edi]. And then this is using "movzx", which is an *extending* move, when in reality you just want to store a byte into a byte location. So this can all be reduced to:

mov [edi], bl

I don't even know if you can specify "byte ptr ebx". What you want is just the low byte of the register, which is what bl is (and which is clearer to use).

As far as writing out the data, you can't know in your upper loop which case it will be (single, multiple) because that is determined by the count. For example, you have:

   movzx   edx, byte ptr [esp][esi]
cmp eax, edx
jne Single_Byte_One

First, it should be "mov dl, byte ptr [esi]" :)
Second, you have the code jumping straight to Single_Byte_One, which is outputting a single byte. But you don't know if it will - you have to see what your count is (1 or more than 1) to determine how to output the data. If the count is greater than one, then it's a multi byte case.

A tip: while you can accumulate the output buffer size, you can get away with not doing that, if it helps. You can simply subtract OutputBuf from edi when done to see how many bytes you wrote:

mov eax, edi
sub eax, OutputBuf


(It also frees up ebp to be used for your frame pointer. Even if you used the built-in parameters and calling convention code, I bet that using ebp is off limits, since it's used to hold the frame pointer.)
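
The same idea in C, as a minimal sketch (copy_and_count is a hypothetical name, and the plain byte copy only stands in for the real encoder): the write pointer plays the role of edi, and the length falls out of a pointer subtraction at the end, so no separate counter is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the encoder: copies len bytes and reports how
   many bytes were written, computed as (write pointer - buffer start),
   which mirrors "mov eax, edi" followed by "sub eax, OutputBuf". */
static size_t copy_and_count(const uint8_t *in, size_t len, uint8_t *out_start) {
    uint8_t *out = out_start;          /* plays the role of edi */
    for (size_t i = 0; i < len; i++) {
        *out++ = in[i];                /* store one byte, advance the pointer */
    }
    return (size_t)(out - out_start);  /* bytes written so far */
}
```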

There is a lot that's good here. If you can straighten out the outputting, so that you have a single "write" entry point that checks the count and the data byte and writes appropriately, then it will be quite far along.

One thing to keep in mind is that you are dealing with bytes here. You don't need to do all of this zero extension to 32-bit. Use the 8-bit registers you have. A single 32-bit eax or ebx can hold two 8-bit values (e.g. al and ah, bl and bh). That should free up some registers if you find you need them.
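
As a rough picture of how al and ah sit inside eax, here is a small C sketch (the helper names are invented for illustration). One consequence for the listing above: after a "movzx eax, byte ptr [...]" the data byte lives entirely in al and ah is zero, so a check of the data byte's two upper bits would look at al, and the threshold for "both top bits set" is 0C0h rather than 0Ch.

```c
#include <stdint.h>

/* al is bits 0..7 of eax. */
static uint8_t low_byte(uint32_t eax) {
    return (uint8_t)(eax & 0xFFu);
}

/* ah is bits 8..15 of eax. */
static uint8_t high_byte(uint32_t eax) {
    return (uint8_t)((eax >> 8) & 0xFFu);
}

/* Two independent 8-bit values sharing one 32-bit register. */
static uint32_t pack_al_ah(uint8_t al, uint8_t ah) {
    return (uint32_t)al | ((uint32_t)ah << 8);
}
```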

And don't be afraid to use "inc" and "dec" instead of adding and subtracting one. Not only is it more compact (and more efficient and clearer), but if anyone who knows assembly ever looks at your code, they will wonder why it was done the other way. :)

As far as the comments you wrote from others, getting the naming convention correct is important as far as linking goes. But there is another consideration: with a C calling convention, the caller cleans up the stack. With a stdcall calling convention, the *called* function (that is, your code in this case) needs to clean up the stack before returning. It's not hard, but it needs to be done. Otherwise, you'll end up with a crash eventually. If you can get your function declared without stdcall, it might be easier. But if not, it's not too hard to make it work (you just need to specify how many bytes to remove from the stack when you return).

I don't know how much help you want with this, so I'll stop at this point. But we could go through the code in more detail if you want.

HegemonKhan
I still don't really understand the stack and/or the stack frame, and its ordering/displacement/offset/address stuff completely.

--------------

I'm a bit confused about what exactly the 'mov ebp, esp' is for, since I think you're saying I can do this:

mov reg, [reg]
example: mov eax, [esi]

whereas, I thought I'd need to do this? :

mov eax, [ebp][esi]

or even this:

mov eax, [ebp+12] ; mov eax, [param1]

-------

also, do I need to do this? :

mov ebp, esi
add [ebp], displacement

to get past the slots for the stack frame pointer, return address pointer, parameters (if I end up using or required to use them), and etc:

(if I understand this stack construction/deconstruction/ordering stuff)
(assuming they're all 'byte' in size...)
[ebp+0] = stack frame pointer
[ebp+4] = return address
[ebp+8] = param1
[ebp+12] = param2
etc etc etc
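
The layout listed above can be checked with a small simulation in C. This is only a sketch under stated assumptions: the sim_* names are hypothetical, every slot is 4 bytes (a dword, which is why the offsets step by 4), the three arguments are pushed right to left, and the extra registers are pushed after "mov ebp, esp". Registers pushed before that instruction would shift the offsets.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical model of a downward-growing stack of 4-byte slots. */
typedef struct {
    uint32_t mem[STACK_WORDS]; /* one element per 4-byte slot */
    uint32_t esp;              /* byte offset of the current top of stack */
    uint32_t ebp;              /* frame pointer */
} sim_stack;

static void sim_init(sim_stack *s) {
    s->esp = STACK_WORDS * 4;  /* empty stack: esp sits just past the end */
    s->ebp = 0;
}

static void sim_push(sim_stack *s, uint32_t value) {
    s->esp -= 4;               /* the stack grows toward lower addresses */
    s->mem[s->esp / 4] = value;
}

static uint32_t sim_read(const sim_stack *s, uint32_t addr) {
    return s->mem[addr / 4];
}

/* Caller pushes three dword arguments right to left, "call" pushes the
   return address, then the callee builds its frame and saves two registers. */
static void sim_call(sim_stack *s, uint32_t a1, uint32_t a2, uint32_t a3,
                     uint32_t ret_addr) {
    sim_push(s, a3);
    sim_push(s, a2);
    sim_push(s, a1);
    sim_push(s, ret_addr);     /* call           */
    sim_push(s, s->ebp);       /* push ebp       */
    s->ebp = s->esp;           /* mov ebp, esp   */
    sim_push(s, 0xAAAAAAAAu);  /* push esi (placeholder value) */
    sim_push(s, 0xBBBBBBBBu);  /* push edi (placeholder value) */
}
```

In this model [ebp+4] is the return address and [ebp+8], [ebp+12], [ebp+16] are the three arguments, and those offsets stay the same no matter how many registers are pushed after the frame is set up, because later pushes move esp and not ebp.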

------

I use 'add' as I thought it was slightly more efficient than 'inc'... I keep getting opposite statements... inc is more efficient, no, add is more efficient, no, inc is more efficient... it feels like the nutrition/science "surveys", one survey says this food is good for you, another survey comes out saying that food is bad for you, another survey comes out saying that food is good for you... etc etc etc

I wish I could get some final clarification on which is more efficient... sighs. I used 'inc' originally as I thought it was more efficient, then I'm told that it is actually 'add' that is more efficient, so I use 'add'. Then I'm told that 'inc' is more efficient. I use inc again, then I'm told 'add' is more efficient, arg

----------------

I think my 'Single_Byte_One' will check and jump it to the correct label operation (Multiple_Byte or Single_Byte_Two)

Single_Byte_One: this will write single bytes which don't have the upper 2 bits set, or act as a re-direct-hub to the labels to handle the other two cases (if there's multiple bytes: count > 1, or if the single byte has the 2 upper bits set, which means it must write 2 bytes).

Single_Byte_Two: this is for writing single bytes which have the two upper bits set, thus it's actually writing 2 bytes into the output buffer.

Multiple_Byte: this is for writing multiple bytes (having a count > 1)

all 3 of these label-operations, check if it's done reading/writing, jumping to Finish (I couldn't figure out a more efficient way/design, sighs)

I think my logic is right... as to which I have the jumps going to, but please notify me if any are incorrect.

if the Outer_Loop hits ecx = 0, then I know it's only a single byte (thus it jumps to the Single_Byte_One), however, it can have its upper bits set, which means I need it to jump to Single_Byte_Two, to handle the writing of its 2 bytes: [C0 + count][value of initial slot value stored in eax, which I have the Single_Byte_One being able to handle, it'll send/jump to Single_Byte_Two if the upper bits are set]

if the Inner_Loop fails the comparison, it can be a single byte (which can either have its upper bits set or not) or a multiple byte, and thus it jumps to Single_Byte_One (which will check if it needs to redirect it to Multiple_Byte or Single_Byte_Two, or not).

if the Inner_loop doesn't fail the comparison, then it *IS* a multiple byte, and is thus jumped to that label-operation (Multiple_Byte) if it has reached the end of its available looping (3Fh, as 40h would cause C0, C0+40, to overflow into the next byte, right? 1100:0000 + 0100:0000 = 0000:0001::0000:0000) or if ecx = 0
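
For comparison, the three cases described above can be written as a short reference model in C. This is a sketch, not the assignment's required implementation: the names rle_encode_ref, RLE_MARKER and RLE_MAX_RUN are made up here, the format is inferred from the description in this post (a run is written as [C0h + count][value] with a count of at most 3Fh, and a lone byte with both upper bits set is written as [C1h][value]), and the output buffer is assumed to have room for twice the input length.

```c
#include <stddef.h>
#include <stdint.h>

#define RLE_MARKER  0xC0u  /* both top bits set: this byte carries a count */
#define RLE_MAX_RUN 0x3Fu  /* largest count that fits in the low six bits  */

/* Reference sketch of the encoder. Returns the number of bytes written.
   Assumption: out has room for 2 * len bytes (the worst case). */
static size_t rle_encode_ref(const uint8_t *in, size_t len, uint8_t *out) {
    uint8_t *dst = out;
    size_t i = 0;
    while (i < len) {
        uint8_t value = in[i];
        size_t count = 1;
        /* Extend the run while the next byte matches and the count fits. */
        while (i + count < len && in[i + count] == value && count < RLE_MAX_RUN) {
            count++;
        }
        if (count > 1 || (value & RLE_MARKER) == RLE_MARKER) {
            /* Multiple_Byte and Single_Byte_Two: count byte, then value. */
            *dst++ = (uint8_t)(RLE_MARKER | count);
            *dst++ = value;
        } else {
            /* Single_Byte_One: a lone byte without both top bits set. */
            *dst++ = value;
        }
        i += count;
    }
    return (size_t)(dst - out);
}
```

Writing it this way also shows that Single_Byte_Two is just the Multiple_Byte case with a count of 1, so one "write" path that looks at the count and the data byte is enough.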

---------

btw, is this correct bit-manipulation/arithmetic, which I used ?, will this work correctly? :

for writing the set upper bits single byte into its 2 bytes:

for Single_Byte_Two and Multiple_Byte:

[0xC:0 h + (count value) h] // :: [2nd byte that holds initial value stored in eax, X:X h or XXXX:XXXX y]

0xC:0 h to 0xF:F h

0xC:0 h + 3:F h = 0xF:F h

(0xC:0 h = 1100:0000 y)
+ (3:F h = 0011:1111 y)
___________________

[0xF:F h = 1111:1111 y] // :: [2nd byte that holds the initial value stored in eax, X:X h or XXXX:XXXX y]

right ???

do I need to use 'adc' instead, how do I handle the overflow from the low nibble (count value) into the upper nibble ??

or do I not use 'add/adc', and instead just mov the values into the byte's nibbles (high and low), but how would I deal with moving the overflow from the lower nibble into the upper nibble ???

-----------------------------

here's my new code:

(I'm a little confused between increasing the address itself vs working with, adding/moving, the value/s at that address, in regards to whether I enclose it within the brackets or not)

(for now, I'm just leaving it as the movzx 32-bit regs and byte ptr stuff usage, I'll go back later and see if I can convert it successfully over to using the 16 and 8 bit sub-division regs, and whether I need the 'byte ptrs' or not, later)

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

; redacted
; Due: 5:00 pm, Wed., April. 6, 2016

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; Credit (those who helped me)
;------------------------------------------------------------------------------

; Online person Jay
; online resources
; colleagues

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; The purpose of this program is to encode (compress) a data file.

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none ;makes it case sensitive

PUBLIC RLE_Encode

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;I had issues with trying to link to the "win32API.asm" file, (pasted it below)

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************

ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD

CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD

TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************

NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************

STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************

ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************

FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; Prototypes
;------------------------------------------------------------------------------

DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char *OutputBuf) public

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

CARRIAGE_RETURN EQU 0Dh
NEW_LINE_FEED EQU 0Ah

NULL_POINTER EQU 00h

;***********
; VARIABLES
;***********

heading byte "Due: 5:00 pm, Wed., April. 6, 2016",
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

history byte "Version 1.0", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

credit byte "Credits (those who helped me): ", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED, \
"Online person Jay", \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

purpose byte "This program's purpose is to encode ", \
"(compress) a data file." \
CARRIAGE_RETURN, NEW_LINE_FEED, NEW_LINE_FEED

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

Main Proc

push eax
call RLE_Encode

;**********************
; Terminating Program
;**********************

Finish:

invoke ExitProcess, NULL_POINTER

Main endp

;************
; Procedures
;************

DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char *OutputBuf) proc public \
stdcall uses ebx, ecx, edx, esi, edi, ebp, esp

push ebp
movzx ebp, esp

push eax

movzx esi, InputBuf
movzx ecx, InputLength
movzx edi, OutputBuf

xor edx, edx

Outer_Loop:

movzx eax, byte ptr esi
movzx ebx, 01h
add [esi], 01h
sub ecx, 01h
cmp ecx, 00h
je Single_Byte_One


Inner_Loop:

movzx edx, byte ptr esi
cmp eax, edx
jne Single_Byte_One

add [esi], 01h
add ebx, 01h
sub ecx, 01h

cmp ebx, 3Fh
je Multiple_Byte

cmp ecx, 00h
je Multiple_Byte

jmp Inner_Loop

Multiple_Byte:

add ebx, 0C0h

movzx edi, byte ptr ebx
add [edi], 01h

movzx edi, byte ptr eax
add [edi], 01h

cmp ecx, 00h
je Finish

jmp Outer_Loop

Single_Byte_Two:

movzx edi, byte ptr 0C1h
add [edi], 01h

movzx edi, byte ptr eax
add [edi], 01h

cmp ecx, 00h
je Finish

jmp Outer_Loop

Single_Byte_One:

cmp ebx, 01h
jne Multiple_Byte

cmp ah, 0Ch
jae Single_Byte_Two

movzx edi, byte ptr eax
add [edi], 01h

cmp ecx, 00h
jne Outer_Loop

Finish

mov eax, [edi]
sub eax, [OutputBuf]

pop eax
pop ebp

ret

RLE_Encode endp

;*************************
; Program End/Termination
;*************************

end Main

jaynabonne
You're right - your code was handling the counts. My bad, and I apologize.

As far as the stack goes, you're mixing two different metaphors in your code. By listing arguments after your function (e.g. the "char *InputBuf" and all the rest), you're telling the assembler to generate the frame prolog code for you. As shown on that page I sent you before, if you use that, then it will automatically put in the "push ebp / mov ebp,esp", etc. stuff for you. When you use "movzx esi, InputBuf" (and, again, it should just be "mov" - you're not extending a lower bit value to a higher bit one. You're just moving a 32-bit value around), under the covers, the assembler is turning that into "mov esi, [ebp+8]". Since you indicated to me that you can't use the calling convention stuff, you don't want to be doing that anyway. If I understand it properly, you need to get rid of the arguments and return type in your proc and handle it yourself. So you would have:

RLE_Encode proc public
push ebp
mov ebp, esp

push eax

mov esi, [ebp+8]
mov ecx, [ebp+12]
mov edi, [ebp+16]

instead of using the assembler-generated parameter names and the like.

HegemonKhan wrote: "I use 'add' as I thought it was slightly more efficient than 'inc'"


"inc" exists as a more efficient form of "add 1" (or it used to, at least). It takes fewer bytes (you don't need the value being added, for one thing), so it's fewer memory cycles to read, and its microcode is dedicated to simply doing the add 1 as opposed to the more general purpose addition of an arbitrary value. Unless CPU microarchitecture has changed drastically when I wasn't looking... :) Consider the CPU emulation you did before - which would be easier to implement, a single byte "inc" instruction, or an "add" instruction where you have to read the opcode and then read the value to add after it? (But I do see the debate, and things are not always as they seem: http://stackoverflow.com/questions/1338 ... an-inc-x86 To be honest, go with what you want, especially if some think add is ok. It just seems odd to me, especially in cases where you're not concerned about optimization, to not use the instruction that seems clearer. Plus I have this automatic "noooo" when I see the add. Times may have changed.)

HegemonKhan wrote: "do I need to use 'adc' instead, how do I handle the overflow from the low nibble (count value) into the upper nibble ??"


You can't really even address things at the nibble level as far as addition and subtraction goes. Those operations operate on a minimal unit of a byte, which is 8 bits as a unit. The thing you need the carry for with adc is propagating carry across registers or memory locations when doing multi-byte/word/whatever operations.
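
To illustrate what the carry is for, here is a sketch in C of adding two 64-bit values held as pairs of 32-bit halves (the u64_pair type and add_with_carry name are hypothetical). The low halves are added first, as "add" would do, and the carry out of that addition is fed into the high halves, as "adc" would do. Within a single byte, nothing like this is needed: the adder already carries from the low nibble into the high nibble on its own.

```c
#include <stdint.h>

/* Hypothetical 64-bit value stored as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

static u64_pair add_with_carry(u64_pair a, u64_pair b) {
    u64_pair r;
    r.lo = a.lo + b.lo;                   /* add: may wrap around        */
    uint32_t carry = (r.lo < a.lo) ? 1u : 0u; /* 1 if the low half wrapped */
    r.hi = a.hi + b.hi + carry;           /* adc: add plus the carry     */
    return r;
}
```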

You're generally handling the 0xC0 fine. You can either add 0xc0 or "or" the bits in. I personally prefer the "or" since it's clearer that I'm setting two bits to those values as opposed to addition, which has other semantics to me. But as long as you know the top two bits are clear, the add *works* just as well.
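
A quick way to convince yourself, sketched in C with made-up helper names: for any count from 1 to 3Fh the two forms give the same byte, and they only diverge once the count no longer fits in six bits.

```c
#include <stdint.h>

/* Build the count byte by setting the two top bits. */
static uint8_t count_byte_or(uint8_t count) {
    return (uint8_t)(0xC0u | count);
}

/* Build the count byte by addition; the result is truncated to 8 bits,
   the way an 8-bit register would hold it. */
static uint8_t count_byte_add(uint8_t count) {
    return (uint8_t)(0xC0u + count);
}
```

With a count of 3Fh both give FFh. With a count of 40h the "or" form gives C0h while the addition wraps to 00h, which is why the run length is capped at 3Fh.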

HegemonKhan wrote: "(for now, I'm just leaving it as the movzx 32-bit regs and byte ptr stuff usage, I'll go back later and see if I can convert it successfully over to using the 16 and 8 bit sub-division regs, and whether I need the 'byte ptrs' or not, later)"


That's fine. Just remember that you don't need to use "movzx" *unless* you're dealing with a byte ptr that needs to be extended to a higher-bit (16 or 32) register. You don't want/need to use it just to move 32-bit values around, and even if the assembler allows it, it's confusing to someone looking at the code, since movzx has the express purpose of moving from lower to higher bit widths, which is not the case for something like "mov ebp, esp".

(In fact, if you look at this page, you'll see there is no supported variant of movzx that takes a 32-bit source. If it assembles, then the assembler must be correcting it for you: http://x86.renejeschke.de/html/file_mod ... d_209.html)

HegemonKhan
I'm really confused by and/or don't understand this stuff very well, so I have no idea what is right and what is wrong, if you could take a look at it and correct (and explain) it, I'd be appreciative:

new code:

.586

.MODEL flat, stdcall

option casemap :none

public _RLE_Encode ; Do I need this?

dword _RLE_Encode (char *InputBuf, dword InputLength, char *OutputBuf) public ; Is this correct? I keep this like this, or no?

.STACK 4096

.DATA

;blah stuff

.CODE

; main proc

; what do I do here, do I even need this stuff, do I just have the encode procedure ???
; push eax ; do I need this here? do I want to push eax or not, as it's suppose to be used by the C++ to return the OutputBuf length, I think
; call _RLE_Encode ; or am I too use 'invoke' instead?
; Finish:
; invoke ExitProcess, 00h

; Main endp

_RLE_Encode proc public ; is this correct header?

; is this stuff below correct? especially whether I use the 32 bit regs, 16 bit regs, or the 8 bit regs, whether I need to use data type pointers, whether I bracket a reg or not, and etc... I'm just so confused now, I don't know what's right and what's wrong, I'm just not understanding this stuff that well.

push ebp
mov ebp, esp

; do I need to add/sub from esp/ebp ???

; push the other regs ; does this go here?
; push eax ; do I need this here? do I want to push eax or not, as it's supposed to be used by the C++ to return the OutputBuf length, I think
; push the other regs ; does this go here?

; what do I push for the 'return address' byte slot: [ebp+4] ???

mov esi, [ebp+8] ; will it be more than '+8' if I push the other regs?
mov ecx, [ebp+12] ; will it be more than '+12' if I push the other regs?
mov edi, [ebp+16] ; will it be more than '+16' if I push the other regs?

Outer_Loop:

mov al, esi
mov bl, 01h
inc [esi]
dec ecx
cmp cl, 00h
je Single_Byte_One

Inner_Loop:

mov dl, esi
cmp al, dl
jne Single_Byte_One
inc [esi]
inc ebx
dec ecx
cmp bl, 3Fh
je Multiple_Byte
cmp cl, 00h
je Multiple_Byte
jmp Inner_Loop

Multiple_Byte:

or bl, 0C0h
mov edi, bl
inc [edi]
mov edi, al
inc [edi]
cmp cl, 00h
je Finish
jmp Outer_Loop

Single_Byte_Two:

mov edi, 0C1h
inc [edi]
mov edi, al
inc [edi]
cmp cl, 00h
je Finish
jmp Outer_Loop

Single_Byte_One:

cmp bl, 01h
jne Multiple_Byte
cmp ah, 0Ch
jae Single_Byte_Two
mov edi, al
inc [edi]
cmp cl, 00h
jne Outer_Loop

Finish:

mov al, [edi]
sub al, [OutputBuf]

; do I need to add/sub from esp/ebp ???

; pop other regs ; does this go here ?
; pop eax ; does this even get pushed/popped or not?
; pop other regs ; does this go here ?
; pop ebp

; ret ; do I need to add/sub a value here ???

_RLE_Encode endp

; end Main


-----------------------------

off-topic:

about my confusion over 'inc' vs 'add', and the differing statements I've been getting from many various sources (confusing me as to which is better):

if I remember right, I think the prof said that (at least with this 32-bit build/version of MASM) 'add' is (slightly) better/faster in terms of actual execution time, and I also think that 'inc' requires more actual steps/operations underneath (it's basically doing the 'add' optimally, but you've got the overhead of going from the 'inc' to its own add operations), though of course for the human and the file size, 'inc' is the shorter encoding ('inc eax' is 1 byte, whereas 'add eax, 1' is 3 bytes). If I understand this stuff right, and remember correctly. So, from this I had assumed that 'add' was more efficient (to me actual execution speed is the main meaning of efficiency, unless you've got a big or professional program where file size and etc matter more than execution speed), so that's why I've been using 'add' over 'inc'... but it does seem to be a bit of a grey area, as efficiency seems to be a bit arbitrary, unless you've got a glaring inefficiency vs efficiency comparison.

jaynabonne
Answers and corrections inline below, preceded by "----". (I haven't had time to look over all the logic, being at work. I just went for syntax, etc issues that jumped out at me.)

.586

.MODEL flat, stdcall

option casemap :none

public _RLE_Encode ; Do I need this?

dword _RLE_Encode (char *InputBuf, dword InputLength, char *OutputBuf) public ; Is this correct? I keep this like this, or no?
---- I don't think you need this. I could be wrong, but you don't need to tell the compiler params, since you're doing it yourself.
---- Having said that, when I did assembly language, we didn't have that feature. I could look at it more, if you wish.

.STACK 4096

.DATA

;blah stuff

.CODE

; main proc

; what do I do here, do I even need this stuff, do I just have the encode procedure ???
; push eax ; do I need this here? do I want to push eax or not, as it's supposed to be used by the C++ to return the OutputBuf length, I think
; call _RLE_Encode ; or am I to use 'invoke' instead?
; Finish:
; invoke ExitProcess, 00h

---- You shouldn't need that. You're not calling your function from here. The "main" entry point will be supplied by the C code,
---- which will be calling this as a subroutine.

; Main endp

_RLE_Encode proc public ; is this correct header?

---- I think that looks ok.

; is this stuff below correct? especially whether I use the 32 bit regs, 16 bit regs, or the 8 bit regs, whether I need to use data type pointers, whether I bracket a reg or not, and etc... I'm just so confused now, I don't know what's right and what's wrong, I'm just not understanding this stuff that well.

push ebp
mov ebp, esp

; do I need to add/sub from esp/ebp ???

---- You would only need to sub further from esp if you were using local (temporary/auto) variables on the stack.
---- You shouldn't need to (you should be able to get away with using registers alone), so you can skip that part.

; push the other regs ; does this go here?

---- pushing other registers would go here.

; push eax ; do I need this here? do I want to push eax or not, as it's supposed to be used by the C++ to return the OutputBuf length, I think

---- you will be returning the size in eax, so there's no point in saving it. (It will be lost regardless.)

; push the other regs ; does this go here?

---- Yes, save any other registers here. Not ebp, of course, since it's already saved. :)

; what do I push for the 'return address' byte slot: [ebp+4] ???

---- The return address is already on the stack, pushed there when the C code made the call. That's what allows your ret to go home when done.

mov esi, [ebp+8] ; will it be more than '+8' if I push the other regs?
mov ecx, [ebp+12] ; will it be more than '+12' if I push the other regs?
mov edi, [ebp+16] ; will it be more than '+16' if I push the other regs?

---- You have set ebp to be esp *before* you pushed the registers, and the stack grows downward. So the +8, etc will be the values before your push.
---- So no pushing afterwards will have any effect on the offsets you use.
---- If for some reason you had set ebp *after* you pushed all the registers (not common or recommended), then you would need to change the offsets
---- to account for that, since esp would have changed drastically - which is probably why people don't do it.

Outer_Loop:

mov al, esi ---- mov al, [esi] ; read what esi points to
mov bl, 01h
inc [esi] ---- inc esi ; inc the esi pointer value (not its contents)
dec ecx
cmp cl, 00h ---- this is a 32-bit count, so it would be ecx. However, dec sets the z flag when ecx reaches 0, so you don't even need this line.
je Single_Byte_One

Inner_Loop:

mov dl, esi ---- mov dl,[esi]
cmp al, dl
jne Single_Byte_One
inc [esi] ---- inc esi
inc ebx
dec ecx
cmp bl, 3Fh
je Multiple_Byte
cmp cl, 00h ---- using full 32-bits for count, so this would be ecx.
je Multiple_Byte
jmp Inner_Loop

Multiple_Byte:

or bl, 0C0h
mov edi, bl ---- mov [edi], bl
inc [edi] ---- inc edi
mov edi, al ---- mov [edi], al
inc [edi] ---- inc edi
cmp cl, 00h ---- cmp ecx, 0
je Finish
jmp Outer_Loop

Single_Byte_Two:

mov edi, 0C1h ---- mov byte ptr [edi], 0c1h
inc [edi] ---- inc edi
mov edi, al ---- mov [edi], al
inc [edi] ---- inc edi
cmp cl, 00h ---- cmp ecx, 0
je Finish
jmp Outer_Loop

Single_Byte_One:

cmp bl, 01h
jne Multiple_Byte
cmp ah, 0Ch ---- cmp al, 0c0h (nibble confusion? If ah = 12h and al = 34h, then ax would be 1234h. So c0h is just in al. It doesn't touch ah)
jae Single_Byte_Two
mov edi, al ---- mov [edi],al
inc [edi] ---- inc edi
cmp cl, 00h ---- cmp ecx,0
jne Outer_Loop

Finish:

mov al, [edi] ---- mov eax, edi ; count is full 32-bit value. You only would use the byte registers for the data bytes, not the counts.
sub al, [OutputBuf] ---- sub eax, [ebp+16]

; do I need to add/sub from esp/ebp ???

---- no adjustment needed, since you have no locals

; pop other regs ; does this go here ?

---- yes, pop the registers you pushed here (but not ebp yet)

; pop eax ; does this even get pushed/popped or not?

---- if you pushed and popped eax, you'd lose your return value. So, no.

; pop ebp

; ret ; do I need to add/sub a value here ???

---- if this is C calling convention, then you would just use "ret", as the caller will pop the arguments
---- If it's stdcall, then you need to clean the stack, so it would be "ret 12" to remove the pushed arguments.

_RLE_Encode endp

; end Main
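
To make the point about the [ebp+8], [ebp+12], [ebp+16] offsets concrete, here is a small C++ toy model of the stack. It is only a sketch: `ToyStack` and `make_frame` are made-up names, the return address and saved ebp are placeholder values, and it assumes a cdecl caller that pushes three dword arguments. It shows that registers pushed *after* "mov ebp, esp" move esp but leave the parameter offsets from ebp alone.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit stack: memory is an array of 4-byte slots and
// esp/ebp are byte addresses. Everything here is illustrative, not MASM.
struct ToyStack {
    std::vector<std::uint32_t> mem = std::vector<std::uint32_t>(64, 0);
    std::uint32_t esp = 64 * 4;  // empty stack; it grows downward from here
    std::uint32_t ebp = 0;

    void push(std::uint32_t v) {
        assert(esp >= 4);        // guard against running off the toy stack
        esp -= 4;
        mem[esp / 4] = v;
    }

    // Read the dword at [ebp + offset] (non-negative offsets only).
    std::uint32_t at_ebp(std::int32_t offset) const {
        assert(offset >= 0);
        return mem[(ebp + static_cast<std::uint32_t>(offset)) / 4];
    }
};

// Builds the frame that the caller plus "push ebp / mov ebp, esp" would
// produce, then pushes `saved_regs` extra registers afterwards.
inline ToyStack make_frame(std::uint32_t in_buf, std::uint32_t in_len,
                           std::uint32_t out_buf, int saved_regs) {
    ToyStack s;
    s.push(out_buf);      // caller pushes the last argument first
    s.push(in_len);
    s.push(in_buf);
    s.push(0xAAAAAAAAu);  // "call" pushes the return address (placeholder)
    s.push(0xBBBBBBBBu);  // push ebp (placeholder for the caller's ebp)
    s.ebp = s.esp;        // mov ebp, esp
    for (int i = 0; i < saved_regs; ++i) {
        s.push(0);        // push ebx / push esi / push edi / ...
    }
    return s;
}
```

Building one frame with no saved registers and one with four gives the same values at offsets 8, 12 and 16 in both; only esp differs, by 16 bytes.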

jaynabonne
Looking at the calling conventions a bit (https://en.wikibooks.org/wiki/X86_Disas ... onventions), it looks like you're using the cdecl ("C") calling convention (based on the name of your proc just having a leading underscore). So you can ignore the part about "ret 12".

If you were using a convention that pushes the parameters left-to-right (the old pascal convention does), you would also have to reverse the offsets you use to access the parameters relative to ebp. cdecl and stdcall both push right-to-left, so for those two the offsets are the same and only the stack cleanup differs. (I had a panic when I thought about this. But I think you're ok.)

Yes, the wonders of integrating different languages. I guess that's why Microsoft added the calling convention and parameter types to masm - to hide all the ugliness.
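
For the "ret" versus "ret 12" question, a rough way to see the difference is to follow esp through one call. The C++ sketch below models only the stack-pointer arithmetic; `esp_after_call` and `esp_after_mismatch` are hypothetical helper names, and it assumes three dword arguments (12 bytes). In both conventions esp ends up back where it started; they differ in who adds the 12 bytes back. If the caller is cdecl and the routine still ends in "ret 12", the arguments get removed twice.

```cpp
#include <cassert>
#include <cstdint>

enum class Convention { Cdecl, Stdcall };

// Returns esp after one complete call with `arg_bytes` of arguments,
// starting from `esp_before`. Only stack-pointer arithmetic is modelled.
inline std::uint32_t esp_after_call(std::uint32_t esp_before,
                                    std::uint32_t arg_bytes, Convention conv) {
    assert(arg_bytes % 4 == 0);  // 32-bit arguments occupy whole dwords
    std::uint32_t esp = esp_before;
    esp -= arg_bytes;            // caller: push the arguments
    esp -= 4;                    // call: push the return address
    esp += 4;                    // callee: ret pops the return address
    if (conv == Convention::Stdcall) {
        esp += arg_bytes;        // callee: "ret 12" also discards the arguments
    } else {
        esp += arg_bytes;        // caller: "add esp, 12" after the call returns
    }
    return esp;
}

// A cdecl caller calling a routine that ends in "ret N": both sides
// remove the arguments, so esp ends up N bytes too high.
inline std::uint32_t esp_after_mismatch(std::uint32_t esp_before,
                                        std::uint32_t arg_bytes) {
    assert(arg_bytes % 4 == 0);
    std::uint32_t esp = esp_before - arg_bytes - 4;  // pushes + call
    esp += 4 + arg_bytes;                            // callee: ret N
    esp += arg_bytes;                                // caller: add esp, N
    return esp;
}
```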

HegemonKhan
Thank you so much Jay, I was really confused (I had like all the brackets/no brackets backwards, lol), also probably some of my confusion from being so tired, and now I understand it much better and not so confused as I was (I just read too much information, I didn't know whether it applied to what I was doing or not, and this also caused me to be confused with stuff I shouldn't have been confused with).

Sorry about needing so much help; a lot of this stuff is new to me, so it's been confusing (too much information and not understanding it very well) and difficult for me to get this stuff on my own. Thanks for helping me learn and understand this stuff so much better! Also, being up all day and night working on it and trying to figure it out or understand it, being very tired and sleep deprived, doesn't help matters.

----------

some quick questions:

do I need to use the 'local' key-word/command for the labels, and if I do, does this mess up the stack (do I have to account for the displacement of them) ??? I think there's something where if you got looping/nesting/recursion with labels/procedures, you need to use the 'local', else it doesn't work right ???

just want to make sure that this is correct (as I'm not sure if this is how I set it up for the or'ing to work correctly, do I use the same value of 0C0 h or do I need to use different value/binary/bit sequence to properly line it up for the or'ing?): or bl, 0C0h

were you saying that the prototype would just be this: public _RLE_Encode, ???

(also does it matter if I put the key-words/commands on the left side vs the right side, such as with 'public' in the question above, ??? Can I put, using 'public' again as example, after / to the right of, 'proc', or is this right side space, reserved for the 'uses:regs' and/or the 'parameters:data type' )

to go about testing if this works... I presume I'd have to make a (decoded) data file (does this have to be a specific ext or can I just use a *.txt, notepad file?), check the C++ program, figuring out how it works/gets the file (not sure if I can figure out how to get it to be able to find/get/access the file... hmm... I'll have to see if I can figure it out), and then run the C++ program, and seeing if it can link-find-access my asm file, and then if it'll work correctly or not ???
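
On the "or bl, 0C0h" question above, a small C++ sketch of what that OR does to the count byte may help. The helper names are made up, and the decoding side is an assumption about how a matching decoder would read the byte (the thread only shows the encoder). The idea is that 0C0h sets the top two bits as a "this is a run header" flag, and the low six bits carry the count, which is why the count is capped at 3Fh.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint8_t kRunFlag = 0xC0;    // top two bits set marks a run header
constexpr std::uint8_t kCountMask = 0x3F;  // low six bits hold the run length

// Same effect as "or bl, 0C0h" on a count of 1..63.
inline std::uint8_t pack_run_header(std::uint8_t count) {
    assert(count >= 1 && count <= kCountMask);
    return static_cast<std::uint8_t>(kRunFlag | count);
}

// How a decoder could tell a run header from a plain data byte.
inline bool is_run_header(std::uint8_t b) {
    return (b & kRunFlag) == kRunFlag;
}

// Recover the count from a run header.
inline std::uint8_t run_length(std::uint8_t header) {
    return static_cast<std::uint8_t>(header & kCountMask);
}
```

So a count of 3 becomes 0C3h and the maximum count of 3Fh becomes 0FFh. A data byte that is itself 0C0h or above would look like a header, which is why it has to be written as a run of one (0C1h followed by the byte).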

----------------------------

new (corrected) code:

;------------------------------------------------------------------------------
; HEADING
;------------------------------------------------------------------------------

; redacted
; Due: 5:00 pm, Wed., April. 6, 2016

;------------------------------------------------------------------------------
; HISTORY
;------------------------------------------------------------------------------

; Version 1.0

;------------------------------------------------------------------------------
; Credit (those who helped me)
;------------------------------------------------------------------------------

; Online person Jay
; online resources (need to get/paste them here still)
; Colleagues in/from class

;------------------------------------------------------------------------------
; PURPOSE
;------------------------------------------------------------------------------

; The purpose of this program is to encode (compress) a data file.

;------------------------------------------------------------------------------
; MASM BUILD TYPE
;------------------------------------------------------------------------------

.586

;------------------------------------------------------------------------------
; MODEL, STANDARD, and Option TYPES
;------------------------------------------------------------------------------

.MODEL flat, stdcall

option casemap :none ;makes it case sensitive

;------------------------------------------------------------------------------
; LIBRARIES/MODULES
;------------------------------------------------------------------------------

;I had issues with trying to link to the "win32API.asm" file, (pasted it below)

;********************************************************
; Masm Include File for Windows 32-Bit API Functions
;
; The information contained in this file can be found at
; http://msdn.microsoft.com/en-us/library/default.aspx
;
;********************************************************

;********************************************************
; WINDOWS API FUNCTION PROTOTYPES
;********************************************************

ExitProcess PROTO : DWORD
GetStdHandle PROTO : DWORD
ReadConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
SetConsoleCursorPosition PROTO : DWORD, : DWORD
SetConsoleMode PROTO : DWORD, : DWORD
SetConsoleTextAttribute PROTO : DWORD, : DWORD
WriteConsoleA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
FlushConsoleInputBuffer PROTO : DWORD

CreateThread PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
CreateMutexA PROTO : DWORD, : DWORD, : DWORD
ReleaseMutex PROTO :DWORD
Sleep PROTO : DWORD
WaitForSingleObject PROTO :DWORD,:DWORD
WaitForMultipleObjects PROTO :DWORD,:DWORD, :DWORD, :DWORD
SuspendThread PROTO : DWORD
ResumeThread PROTO : DWORD
ExitThread PROTO : DWORD

CreateFileA PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
ReadFile PROTO : DWORD, : DWORD, : DWORD, : DWORD, : DWORD
GetFileSize PROTO : DWORD, : DWORD
CloseHandle PROTO : DWORD

TIMECAPS Struct
wPeriodMin DWORD ?
wPeriodMax DWORD ?
TIMECAPS Ends

timeGetDevCaps PROTO : DWORD, : DWORD
timeBeginPeriod PROTO : DWORD
timeGetTime PROTO

GetTickCount PROTO

QueryPerformanceCounter PROTO : DWORD
QueryPerformanceFrequency PROTO : DWORD
GetLastError PROTO

;********************************************************
; EQUATES
;********************************************************

NULL EQU 0

;*****************************************************
; Standard Handles
;*****************************************************

STD_INPUT_HANDLE EQU -10 ;Standard Input Handle
STD_OUTPUT_HANDLE EQU -11 ;Standard Output Handle
STD_ERROR_HANDLE EQU -12 ;Standard Error Handle


GENERIC_ALL EQU 10000000h
GENERIC_READ EQU 80000000h
GENERIC_WRITE EQU 40000000h
GENERIC_EXECUTE EQU 20000000h

FILE_SHARE_NONE EQU 0
FILE_SHARE_DELETE EQU 4
FILE_SHARE_READ EQU 1
FILE_SHARE_WRITE EQU 2

CREATE_NEW EQU 1
CREATE_ALWAYS EQU 2
OPEN_EXISTING EQU 3
OPEN_ALWAYS EQU 4
TRUNCATE_EXISTING EQU 5


FILE_ATTRIBUTE_NORMAL EQU 80h

;*****************************************************
; Set Console Mode Equates
;
; Refer to Microsoft's documentation on SetConsoleMode
; for a complete description of these equates.
;*****************************************************

ENABLE_NOTHING_INPUT EQU 0000h ;Turn off all input options
ENABLE_ECHO_INPUT EQU 0004h ;Characters read are written to the active screen buffer (can be used with ENABLE_LINE_INPUT)
ENABLE_INSERT_MODE EQU 0020h ;When enabled, text entered in a console window will be inserted at the current cursor location
ENABLE_LINE_INPUT EQU 0002h ;The ReadConsole function returns only when a carriage return character is read.
ENABLE_MOUSE_INPUT EQU 0010h ;If the mouse is within the borders of the console window & the window has the keyboard focus, mouse events are placed in the input buffer. These events are discarded by ReadFile or ReadConsole.
ENABLE_PROCESSED_INPUT EQU 0001h ;CTRL+C is processed by the system and is not placed in the input buffer.
ENABLE_QUICK_EDIT_MODE EQU 0040h ;This flag enables the user to use the mouse to select and edit text. To enable this option, use the OR to combine this flag with ENABLE_EXTENDED_FLAGS.
ENABLE_WINDOW_INPUT EQU 0008h ;User interactions that change the size of the console screen buffer are reported in the console's input buffer.


;If the hConsoleHandle parameter is a screen buffer handle, the mode can be one or more of the following values. When a screen buffer is created, both output modes are enabled by default.
ENABLE_PROCESSED_OUTPUT EQU 0001h ;Characters written by the WriteFile or WriteConsole function or echoed by the ReadFile or ReadConsole function are examined for ASCII control sequences and the correct action is performed.
ENABLE_WRAP_AT_EOL_OUTPUT EQU 0002h ;When writing with WriteFile or WriteConsole or echoing with ReadFile or ReadConsole, the cursor moves to the beginning of the next row when it reaches the end of the current row.


;********************************************************
; CONSOLE FOREGROUND AND BACKGROUND COLOR EQUATES
;********************************************************

FOREGROUND_BLACK EQU 0
FOREGROUND_DARK_BLUE EQU 1
FOREGROUND_DARK_GREEN EQU 2
FOREGROUND_DARK_CYAN EQU 3
FOREGROUND_DARK_RED EQU 4
FOREGROUND_DARK_MAGENTA EQU 5
FOREGROUND_DARK_YELLOW EQU 6
FOREGROUND_GRAY EQU 7
FOREGROUND_DARK_GRAY EQU 8
FOREGROUND_BLUE EQU 9
FOREGROUND_GREEN EQU 10
FOREGROUND_CYAN EQU 11
FOREGROUND_RED EQU 12
FOREGROUND_MAGENTA EQU 13
FOREGROUND_YELLOW EQU 14
FOREGROUND_WHITE EQU 15

BACKGROUND_BLACK EQU FOREGROUND_BLACK * 10h
BACKGROUND_DARK_BLUE EQU FOREGROUND_DARK_BLUE * 10h
BACKGROUND_DARK_GREEN EQU FOREGROUND_DARK_GREEN * 10h
BACKGROUND_DARK_CYAN EQU FOREGROUND_DARK_CYAN * 10h
BACKGROUND_DARK_RED EQU FOREGROUND_DARK_RED * 10h
BACKGROUND_DARK_MAGENTA EQU FOREGROUND_DARK_MAGENTA * 10h
BACKGROUND_DARK_YELLOW EQU FOREGROUND_DARK_YELLOW * 10h
BACKGROUND_GRAY EQU FOREGROUND_GRAY * 10h
BACKGROUND_DARK_GRAY EQU FOREGROUND_DARK_GRAY * 10h
BACKGROUND_BLUE EQU FOREGROUND_BLUE * 10h
BACKGROUND_GREEN EQU FOREGROUND_GREEN * 10h
BACKGROUND_CYAN EQU FOREGROUND_CYAN * 10h
BACKGROUND_RED EQU FOREGROUND_RED * 10h
BACKGROUND_MAGENTA EQU FOREGROUND_MAGENTA * 10h
BACKGROUND_YELLOW EQU FOREGROUND_YELLOW * 10h
BACKGROUND_WHITE EQU FOREGROUND_WHITE * 10h

;------------------------------------------------------------------------------
; Prototypes
;------------------------------------------------------------------------------

PUBLIC _RLE_Encode

;------------------------------------------------------------------------------
; STACK SIZE
;------------------------------------------------------------------------------

.STACK 4096

;------------------------------------------------------------------------------
; RADIX TYPE
;------------------------------------------------------------------------------

; (placeholder)

;------------------------------------------------------------------------------
; DATA SEGMENT (DS)
;------------------------------------------------------------------------------

.DATA

;*********************
; EQUATES/ENUMERATORS
;*********************

; (placeholder)

;***********
; VARIABLES
;***********

; (placeholder)

;------------------------------------------------------------------------------
; CODE SEGMENT (CS)
;------------------------------------------------------------------------------

.CODE

;************
; Procedures
;************

_RLE_Encode proc public

push ebp
mov ebp, esp

push ebx
push ecx
push edx
push esi
push edi

mov esi, [ebp+8]
mov ecx, [ebp+12]
mov edi, [ebp+16]

Outer_Loop: ; plain label (code labels inside a proc are already local to it, so no 'local' keyword)

mov al, [esi]
mov bl, 01h
inc esi
dec ecx
cmp ecx, 0
je Single_Byte_One

Inner_Loop:

mov dl, [esi]
cmp al, dl
jne Single_Byte_One

inc esi
inc ebx
dec ecx

cmp bl, 3Fh
je Multiple_Byte

cmp ecx, 0
je Multiple_Byte

jmp Inner_Loop

Multiple_Byte:

or bl, 0C0h

mov [edi], bl
inc edi

mov [edi], al
inc edi

cmp ecx, 0
je Finish

jmp Outer_Loop

Single_Byte_Two:

mov byte ptr [edi], 0C1h
inc edi

mov [edi], al
inc edi

cmp ecx, 0
je Finish

jmp Outer_Loop

Single_Byte_One:

cmp bl, 01h
jne Multiple_Byte

cmp al, 0C0h
jae Single_Byte_Two

mov [edi], al
inc edi

cmp ecx, 0
jne Outer_Loop

Finish:

mov eax, edi
sub eax, [ebp+16]

pop edi
pop esi
pop edx
pop ecx
pop ebx

pop ebp

ret ; cdecl: the C++ caller removes the three arguments, so no 'ret 12'

_RLE_Encode endp
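
For checking the assembly's output, it can help to have the same scheme written in C++ to compare against. The following is a minimal reference sketch, not the assignment's code: `rle_encode_ref` is a made-up name, it uses vectors instead of raw buffers, and it assumes the scheme as discussed in this thread (runs capped at 3Fh, run header = count OR 0C0h, and a lone byte of 0C0h or above written as 0C1h plus the byte).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference encoder for comparison with the assembly routine.
//   run of 2..63 equal bytes -> (0xC0 | count), value
//   single byte >= 0xC0      -> 0xC1, value
//   single byte <  0xC0      -> value
inline std::vector<std::uint8_t> rle_encode_ref(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const std::uint8_t value = in[i];
        std::size_t count = 1;
        // Extend the run while the next byte matches and the count fits in 6 bits.
        while (i + count < in.size() && in[i + count] == value && count < 0x3F) {
            ++count;
        }
        assert(count >= 1 && count <= 0x3F);
        if (count > 1 || value >= 0xC0) {
            out.push_back(static_cast<std::uint8_t>(0xC0 | count));  // run header
            out.push_back(value);
        } else {
            out.push_back(value);  // plain literal byte
        }
        i += count;
    }
    return out;
}
```

Feeding the same bytes to both this and the assembly routine, and comparing the results byte for byte, should point at the first place where they disagree.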

HegemonKhan
also one more question:

about the 'stdcall' convention, I think this is the default (if you don't specify one), but I'm not sure if I'm using it or not with regards to this being called through C++, so I'm not sure whether I need to do:

push ebp
mov ebp, esp
mov esi, [ebp+8]
mov ecx, [ebp+12]
mov edi, [ebp+16]

or

push ebp
mov ebp, esp
mov edi, [ebp+16]
mov ecx, [ebp+12]
mov esi, [ebp+8]

(are both of these ways' order, correct? just want to make sure I understand if I'm doing them right)

jaynabonne
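First of all, the two code snippets are functionally the same. :) To reverse the parameters you would need to load a different register from a different offset.

stdcall pushes its arguments right-to-left just like cdecl, though, so it uses the same offsets you already have. You would only need the reversed version for a left-to-right convention (such as pascal), where it would be: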

push ebp
mov ebp, esp
mov esi, [ebp+16]
mov ecx, [ebp+12]
mov edi, [ebp+8]

(I know I haven't answered your other post yet. This was a quick one at work.)

jaynabonne
Also, is this C++ or C?

jaynabonne
As far as a test goes, you don't even need to get as complex as an external file. You can just set up a C array of (unsigned) char with whatever values you want and then pass that in with the size. Just be sure to allocate enough space for the encoded data to go into (the worst possible "compression" would be a doubling of the non-encoded data, so you should probably make your target buffer that big to be sure).
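
A minimal sketch of that kind of in-memory test, in C++. The names `EncodeFn`, `stand_in_encode` and `run_case` are made up for illustration; the function-pointer type just mirrors the `(char *, int, char *)` shape of the prototype in the assignment's C++ file. The stand-in encoder is not the real routine: it writes every byte as a run of one, which happens to be exactly the worst-case doubling, so the harness can be tried before the .asm file links.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same parameter shape as the prototype in the assignment's C++ file.
using EncodeFn = int (*)(char*, int, char*);

// Stand-in so the harness runs without the .asm file: emits (0xC1, value)
// for every input byte, i.e. the worst-case output size of 2 * len.
inline int stand_in_encode(char* in, int len, char* out) {
    char* p = out;
    for (int i = 0; i < len; ++i) {
        *p++ = static_cast<char>(0xC1);
        *p++ = in[i];
    }
    return static_cast<int>(p - out);
}

// Calls `encode` on an in-memory array with an output buffer sized for the
// worst case, and returns only the bytes the encoder reported writing.
inline std::vector<unsigned char> run_case(EncodeFn encode,
                                           std::vector<unsigned char> input) {
    std::vector<unsigned char> output(input.size() * 2 + 1);  // +1 avoids a zero-size buffer
    const int written = encode(reinterpret_cast<char*>(input.data()),
                               static_cast<int>(input.size()),
                               reinterpret_cast<char*>(output.data()));
    assert(written >= 0 && static_cast<std::size_t>(written) <= input.size() * 2);
    output.resize(static_cast<std::size_t>(written));
    return output;
}
```

Once the assembly links, the real function can be passed to `run_case` in place of `stand_in_encode`.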

HegemonKhan
C++ program, thankfully, as... I've not yet learned (nor worked with) C yet at all

jaynabonne
Then definitely put

extern "C"

when you declare your function in the cpp file. C++ name mangling is not something you want to have to deal with!

HegemonKhan
It's not able to link, sighs. Something is wrong in my asm file... no idea how/what to change/fix up so it works...

any chance, you got any ideas ???

--------------

got some more comments by my colleagues:

"
So I found out while working on the project that if you don't have any local variables in your procedure, that the assembler/compiler won't set up a stack frame for your procedure. That means that you need to access your parameters in different ways depending on if you use local variables or not. If you use local variables then you access your parameters with [ebp + 4] then adding the size of the datatype. However, if you don't have local variables then you have to access your variables from esp with some weird offsets.

Why is this? If you have parameters shouldn't a stack frame be set up so that you always access your parameters in the same way?
(redacted)
"

"
What an amazing discovery. Well just thinking about it, the reason why this might happen is because the compiler will allocate extra memory in the stack when you allocate for local variables. Because of this, when you call a procedure that has no local variables, ebp does not change at all! So, if you try to access ebp you are really looking at the ebp from the calling environment. Since this is so, esp is the only pointer that is close enough to touch the pushed parameters.

I discovered this just by messing around on visual studio, so it might need fact checking.
(redacted)
"

---------------

the assignment brief does have this:

the function prototype for the RLE procedure is:

dword RLE_Encode (char *InputBuf, dword InputLength, char *OutputBuf)

should I go and put this in, instead of:

public RLE_Encode proto

what about for the actual procedure definition header ???

jaynabonne
When it doesn't link, it should tell you what it can't find. The name it can't find is the name you need your proc to be called. If you could post the actual error message, it would help.

Neither of the comments by colleagues make sense to me, sadly. :) For the first one, local variables are accessed differently, and they don't impact ebp. You had a question before about adjusting esp, and that's how they come into play. So if you needed 20 bytes of local space (say, for 5 vars), then you'd do:

push ebp
mov ebp, esp
sub esp, 20 ; set aside 20 bytes of stack space. You'd have to free it later by adding 20 back to esp (I think).

But that doesn't impact ebp or the way you access incoming parameters. It *does* mean you'd have a different way of accessing the *locals*, which you can access either by going positive from esp or negative from ebp. Fortunately, you're not using any of that.

For the second comment, it doesn't make any sense. They'd have to look at the actual generated assembly code to see what it's doing. There are too many variables at play here to know definitely what's going on.

The prototype you have is the one that would be in the C++ file, I think. Except it really should be:

extern "C" DWORD RLE_Encode (char *InputBuf, DWORD InputLength, char *OutputBuf);

Given that you're not supposed to use the built-in inter-language stuff, I don't know why you'd need to put a prototype in the asm file. And you need the "extern "C"" so that it uses the simpler C-style name decoration, not the C++ one. Again, if you can put the link error here, I should be able to tell what's going on, as in what it's looking for.

Either way, I doubt it will just be "RLE_Encode". It could be _RLE_Encode for a cdecl function. It will be worse for others. (The page I linked to before shows how different calling conventions decorate the function names.)
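
To show what "decorated" means here, this C++ sketch builds the symbol name that 32-bit MSVC uses for an extern "C" function under the common conventions. `decorate_x86` and `CallConv` are made-up names for illustration; the rules encoded are the usual ones (leading underscore for cdecl, underscore plus @bytes for stdcall, @name@bytes for fastcall). It is a way to predict what the linker will ask for, not something to paste into the project.

```cpp
#include <cassert>
#include <string>

enum class CallConv { Cdecl, Stdcall, Fastcall };

// Decorated name 32-bit MSVC gives an extern "C" function.
// `arg_bytes` is the total size of the function's arguments in bytes.
inline std::string decorate_x86(const std::string& name, CallConv conv,
                                unsigned arg_bytes) {
    assert(!name.empty());
    switch (conv) {
        case CallConv::Cdecl:
            return "_" + name;
        case CallConv::Stdcall:
            return "_" + name + "@" + std::to_string(arg_bytes);
        case CallConv::Fastcall:
            return "@" + name + "@" + std::to_string(arg_bytes);
    }
    return name;  // not reached for valid enum values
}
```

For RLE_Encode with three 4-byte arguments, cdecl gives "_RLE_Encode" and stdcall would give "_RLE_Encode@12", so whichever of those the linker error mentions tells you which convention the C++ side is using.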

HegemonKhan
hmm.. it just says:

it can't find the file 'RLE_Encode', *
LNK 2019 unresolved external symbol _RLE_Encode referenced in function _main
LNK 1120 1 unresolved externals

*which I have the file named that (though not sure if my C++ is even set up right for it, is there any setting changes that I need to make?) and it's inside of the same VS folder as the c++ source file. If I change to the underscore in front of it, do I need to change the asm file name to using the underscore, do I need to change the 'extern' code line to using the underscore also, do I change the prototype to using the underscore, do I change the procedure definition to using the underscore?

inside the given C++ file they have this:

extern "C" int RLE_Encode (char *, int, char *);

jaynabonne

hmm.. it just says:

it can't find the file 'RLE_Encode', *
LNK 2019 unresolved external symbol _RLE_Encode referenced in function _main
LNK 1120 1 unresolved externals


I don't think the linker says, "it can't find the file 'RLE_Encode', *" (I've never seen a linker with the word "it" before. :) ) Is that something you added or paraphrased? I hope I'm not seeming nit picky, but the exact error(s) you get is important.

It definitely can't find the symbol "_RLE_Encode". So you need to name your proc exactly "_RLE_Encode" (without the quotes, and with the leading underscore).

If it is also complaining about not being able to find a file, then you might not be giving the right name or path when linking, so it can't find your .obj file. From the above, I can't tell if you're getting one error (unresolved symbol) or two (file not found *and* unresolved symbol). If it's the former, then you just hopefully need to name the function correctly. If it's the latter, then it can't find the symbol *because* it can't find the file, which means you need to specify how it can find the file correctly as well.

HegemonKhan
ah, so it is dependent also upon me setting up VS correctly too... I wasn't able to figure this out yet... hopefully my code works, and it's just not having the VS set up to be able to find/link my asm file, but maybe there's still an issue with my asm file code too.

I matched its name correctly (underscore), and yes, that was me paraphrasing, though it does say "file not found" (error message upon trying to run the C++ file), so that part is not paraphrasing, but my use of "it" was definitely part of my paraphrasing, lol.

jaynabonne
Can you give the exact error message? It might have a clue.

This topic is now closed. Topics are closed after 14 days of inactivity.
