answered as: Why didn't Intel introduce the original 8086 as a 32-bit CPU, and had they done so, would there have been a 640k limit and real mode vs protected mode?
Before I begin my answer, I need to be exquisitely clear: while I currently work for Intel, these are my own opinions, and I was not at Intel at the times discussed.
As Petr correctly points out, the first 32-bit processor was the Intel iAPX 432 - Wikipedia, introduced in 1981. This was to be the original follow-on to the 80xx series from Intel; it used a modern capability-based system and was meant to be programmed in a high-level language. It is a very cool chip; however, it was slow and expensive. But I’m getting a little ahead of myself.
I’ll not repeat the first part of Tom’s excellent answer, which traces the history of the 8080/8085/8086/8088 processors. Unfortunately, he left out a little about Motorola’s history, which explains some of Intel’s later behavior and design choices to counter the Motorola products in the marketplace.
As Tom points out, the 8-bit register size and 16-bit address space were what the Intel 8080/8085, the Zilog Z80, the MOS 6501/6502, and (its sort-of sibling) the Motorola MC6800 all used (I’m going to ignore the Moto vs. MOS Tech story for this answer).
The important point is that Intel had a regular cadence, and the ‘20-bit’ address space of the 8086/8088 was a logical follow-on to the ‘16-bit’ addressing of the chips that they and their competitors had all made previously.
What is interesting is Motorola’s strategy at that time (the mid-to-late 1970s). Compared to the Intel and Zilog chipsets, the MC6800 and MOS 6502 were considered a lot easier both to program and to interface hardware to. The other thing to remember is that at the time these microprocessors were being manufactured and sold, minicomputers such as DEC’s PDP-11 and the DG Nova were 16-bit systems, although the minis could support an MMU that allowed larger address spaces, such as 18/22/24 bits for the PDP-11.
From an ISA standpoint, the Motorola marketing team chose to stay with a mix of 8-bit and 16-bit registers, still limited to 16 bits of address, in their 1978 processor, the MC6809, which was Moto’s follow-on to the MC6800. It turns out there was an unofficial, so-called ‘skunk-works’ processor project going on in Austin in the back of one of the Motorola development labs, which had started about a year or so after the official (fully funded) MC6809 work had begun.
This underground project proposed to build a new 16-bit processor; it was designed by a bunch of ex-Schlumberger folks (Les Crudele, Tom Gunter, Nick Tredennick). Remember that the logic family and process technology of the day, i.e., the number of gates available, offered much less than what we have today, but the team thought they could create something like the PDP-11 on a single chip (they had been PDP-11 hackers at Schlumberger and in grad school). This was extremely risky, and a lot of people did not think that the technology of the time could support it. Thus, this was not an official project, and it was mostly ignored, or not even known about, at the highest levels of Moto management.
According to my discussions with Les, the team did not even have a budget for things like getting a mask made; the one mask that did get made was snuck through the system. What they did do was build a TTL prototype: Nick started to write microcode, and Les and friends started to lay out the gates in the Motorola process of the time. When they got the TTL system working and the logic guys thought they had a chip, they got one chance to make a mask to prove their point.
The other important part of the story is that, as they were doing this work, DEC had a lawsuit in progress (which it would win) against Cal-Data for cloning the PDP-11 and the Unibus. So with this experimental architecture, the Moto engineers were very careful to have a different bus and not to be exactly like the DEC PDP-11 (separate A registers and D registers, as an example).
As I said, they got one shot at making a chip, and it actually worked pretty well at what Intel would call the ‘A-step.’ They made a few of the chips, and those parts were given an X-series number and handed to some of their closest partners, such as IBM, Tektronix, and HP, to play with. My lab at Tektronix was on the receiving end, and I have written about this elsewhere - the result would become the Magnolia machine.
Famously, when the IBM team from Florida came to Motorola looking for a processor for what would become the PC, Moto marketing brought them the official device, the MC6809. IBM had explained they wanted a 16-bit processor to match the minis of the day, and Moto famously said, “Here is the 6809; it can do 16-bit arithmetic.” IBM said, “Yeah, but … we know about the X68000; we have some of them in Yorktown in Research,” and Moto said, “No, that’s not a product, that’s an experiment.”
The rest is history as they say.
The interesting thing is that, between the IBM PC and other design wins, Intel started to have good success with the 8086, and Motorola was losing market share with the MC6809. So Moto’s management made the late choice to turn the experiment into a committed project, which, in the end, would become their most successful processor family (although it was too late for the PC).
Tom mentioned in his answer that this experimental chip was a 32-bit processor. It actually is not. The natural size of the C ‘int’ data type for the original 68000 is 16 bits (more in a minute). As I explained in my answer, Clem Cole's answer to What does 32bit and 64bit refer to?, the basic idea is that there are two primary functional units that determine the fundamental size of a CPU: the data functional units and the address units. The computer has a memory, which today is broken up into 8-bit bytes.
The processor collects those bytes together into ‘chunks’ and manipulates them inside the processor itself in what are called ‘arithmetic functional units.’ These are the adders and multipliers, with the most important of them being what is called a Barrel shifter - Wikipedia.
The old Motorola 68000, which was the processor in the original Apple Mac, was a 16-bit chip using a 16-bit shifter and arithmetic unit, although all the registers were defined as 32 bits and the external address bus was 24 bits. Only later, with the 68020, did Motorola make a full 32-bit device.
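To make the ‘internally 16-bit’ point concrete, here is a tiny C sketch (purely an illustration of the arithmetic, not Motorola’s actual microcode) of how a 32-bit add must be done by a 16-bit-wide ALU: two passes, with a carry propagated between the halves, which is why 32-bit operations on the original 68000 cost extra clock ticks.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Illustration only: a 32-bit add performed the way a 16-bit-wide ALU
 * must do it, in two passes with a carry propagated between the halves. */
static uint32_t add32_on_16bit_alu(uint32_t a, uint32_t b)
{
    uint16_t a_lo = (uint16_t)a, a_hi = (uint16_t)(a >> 16);
    uint16_t b_lo = (uint16_t)b, b_hi = (uint16_t)(b >> 16);

    uint16_t sum_lo = (uint16_t)(a_lo + b_lo);          /* pass 1: low halves   */
    uint16_t carry  = (uint16_t)(sum_lo < a_lo);        /* carry out of pass 1  */
    uint16_t sum_hi = (uint16_t)(a_hi + b_hi + carry);  /* pass 2: high halves  */

    return ((uint32_t)sum_hi << 16) | sum_lo;
}

int main(void)
{
    printf("%08" PRIX32 "\n", add32_on_16bit_alu(0x0001FFFFu, 1u)); /* prints 00020000 */
    return 0;
}
```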
BTW: another important thing to note about the original chips from all of the vendors in those days was the lack of an MMU built into the chip. The original 68000 had a separate and optional PDP-11-style base/limit-register chip [there is a similarly famous story about Apple and that chip for the Lisa and the Mac, but it is not relevant to the answer to this Intel-based question].
So now we can come back to Intel and its follow-on to the 8086/8088. As Tom explained, IBM had picked the 8086 device family for the IBM PC. As follow-ons to that processor, Intel countered the new Motorola part with two more chips, the 80186 and the 80286. Besides a larger physical address space, a new built-in MMU was added (on the 80286). They also added the new hardware that Tom referred to and the OP asked about: real and protected modes, which allowed code for the earlier 8086/8088 to continue to execute, user/kernel mode, and the like.
As Tom pointed out, both the MC68000 and the 80286 had 24 bits of external physical addressing. And both chips were internally 16-bit processors (i.e., the barrel shifter and basic functional units operated 16 bits at a time).
However, the two teams took two different paths that would have huge consequences from a software standpoint. While the processor did everything 16 bits at a time, the Motorola team created linear 32-bit registers, whereas Intel used a trick called ‘segmentation,’ which split every address into two pieces: a 16-bit ‘segment’ value and a 16-bit ‘offset.’ This breaks memory into small 64 KB chunks, called segments, which can overlap each other. It turns out that for some programming languages of the day, such as Algol, Pascal, and Fortran, segmentation was not a big deal, because they tended to think of data as ‘words’ and could support the chunk idea fairly easily. But other languages, such as C, really preferred a linear, byte-addressed memory space (in truth, as my compiler friends all tell me, linear addressing is actually easier for the other languages too, but some of the larger 36-bit systems, like the large GE machines and the DEC PDP-10, were also word-oriented, so tricks like the one Intel used were not unnatural for compiler writers of the day).
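To make the segmentation trick concrete, here is a small C sketch (a simplification of real-mode 8086 address arithmetic, not anything out of Intel’s manuals) showing how a 16-bit segment and a 16-bit offset combine into a 20-bit physical address, and how two different segment:offset pairs can name the same byte.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Real-mode 8086 address formation: physical = (segment << 4) + offset,
 * truncated to 20 bits. Each segment is a 64 KB window, and the windows
 * start every 16 bytes, so they overlap heavily. */
static uint32_t phys_8086(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

int main(void)
{
    /* Two different segment:offset pairs that alias the same physical byte. */
    printf("%05" PRIX32 "\n", phys_8086(0x1234, 0x0010)); /* prints 12350 */
    printf("%05" PRIX32 "\n", phys_8086(0x1235, 0x0000)); /* prints 12350 */
    return 0;
}
```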
But that single difference in the way memory and data are treated (segmentation vs. linear) made a huge difference in practice, particularly as UNIX and C started to become popular.
The 32-bit C ‘long’ data type was a natural data type for the MC68000, although it took two clock ticks to support a 32-bit operation, since it was a 16-bit chip inside. The other important design choice that Les and Nick made was a lesson they had learned from the IBM 360 architecture. The MC68000 only had 24 bits of physical address, but like the IBM architecture, the Moto designers stored and maintained the upper 8 bits throughout the microcode, as though they had 32 bits of address space. This meant that, like the IBM 360 family, when later chips (what IBM called models) were designed with more external address pins and actually used those upper 8 bits, the software built for the earlier chips would continue to work unchanged. The result for the programmer is a smooth, linear address space that is not broken into pieces.
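A minimal sketch of the programmer-visible effect (not of the silicon itself): the address registers hold a full 32-bit value either way; what changes across the family is how many of those bits reach the pins, so code that keeps its pointers as clean 32-bit values runs unchanged on the later parts.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Programmer-visible effect only: the register is 32 bits wide on every
 * family member; the original part simply drives fewer bits to the bus. */
static uint32_t bus_address_68000(uint32_t reg)  /* original 68000: 24 address pins */
{
    return reg & 0x00FFFFFFu;
}

static uint32_t bus_address_32bit(uint32_t reg)  /* later full 32-bit parts */
{
    return reg;
}

int main(void)
{
    uint32_t p = 0x00123456u;  /* a "clean" pointer: upper byte left zero */
    printf("%08" PRIX32 " %08" PRIX32 "\n",
           bus_address_68000(p), bus_address_32bit(p));
    return 0;
}
```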
At the time, many of us thought that Intel had engineered themselves into a corner with the 808x ISA, and there was little they could do about it. But the Intel engineers pulled off a remarkable move and in 1985 produced a chip that, like its predecessors, was still segmented, but with a 32-bit offset within each segment and a vastly larger address space. I always laughed because, if you look at the old Intel 386 system software book from when it was released, the authors still talked about how great segments were; meanwhile, all the software designers just ran the new 386 chip as a 32-bit linear system, as everyone else had done with the Motorola 68000 family.
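How did everyone ‘run it as a linear system’? Typically by loading segment descriptors with base 0 and a 4 GB limit, the so-called flat model, so the segmentation hardware effectively drops out of the picture. Here is a hedged C sketch that packs such descriptors in the standard x86 descriptor format (the 0x9A/0x92 access bytes and G/D flags are the usual flat-model choices, not anything specific to the systems discussed above).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Pack an 8-byte x86 segment descriptor from its fields. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |=  (uint64_t)(limit & 0xFFFFu);              /* limit bits 15:0  */
    d |= ((uint64_t)(base  & 0xFFFFFFu)) << 16;     /* base  bits 23:0  */
    d |= ((uint64_t)access) << 40;                  /* access byte      */
    d |= ((uint64_t)((limit >> 16) & 0xFu)) << 48;  /* limit bits 19:16 */
    d |= ((uint64_t)(flags & 0xFu)) << 52;          /* G, D/B, L, AVL   */
    d |= ((uint64_t)(base >> 24)) << 56;            /* base  bits 31:24 */
    return d;
}

int main(void)
{
    /* Flat model: base 0, limit 0xFFFFF with 4 KB granularity = 4 GB. */
    uint64_t code = make_descriptor(0, 0xFFFFF, 0x9A, 0xC); /* ring-0 code */
    uint64_t data = make_descriptor(0, 0xFFFFF, 0x92, 0xC); /* ring-0 data */
    printf("code %016" PRIX64 "\ndata %016" PRIX64 "\n", code, data);
    return 0;
}
```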
Finally, Tom made a comment saying that if the first 80x86 chip had supplied 32-bit support from the start, it probably would have had memory protection. This is where I disagree a bit with him. The 80286 had an MMU, full protection, et al., so I agree that the Intel design team saw the need for an integral MMU before Motorola did; but the trick was not 32 bits, it was supporting linear addressing, which was what the MC68000 had and the Intel devices lacked.
As a small postscript, when Motorola created the 68020, they made a fully 32-bit chip internally (and the C ‘int’ data type could naturally become a 32-bit value, operated on in a single clock tick), but unlike Intel’s part, that device still needed an external MMU chip (Moto would not integrate the MMU on-chip until the 68030). But the point is that, because the original Motorola device was linear in its addressing, all the new chips that followed the 68000 ‘just worked’ from a software standpoint.