Do terms like direct/indirect addressing mode actually exist in the Intel x86 manual?

Question

To give a little bit of background, I wanted to study how x86 instructions are encoded/decoded manually. I came across the ModR/M and SIB bytes and it seems that understanding the x86 addressing modes is fundamental to understanding the instruction encoding scheme.

Hence, I did a Google search for x86 addressing modes. Most blogs/videos that the search returned covered addressing modes for the 8086 processor. Going through some of them, the different addressing modes were Register, Direct, Indirect, Indexed, Based, and some more. But the blogs use inconsistent names when referring to these addressing modes, and different sources list different sets of modes. The terms are not even mentioned in the Intel manual. For example, I can't seem to find anywhere in the Intel manual an addressing mode called Direct or Indirect. Also, the Mod field in the ModR/M byte is a 2-bit field, which makes me wonder if more than 4 addressing modes are possible.

My question is: are terms like Direct addressing mode and Indirect addressing mode older terms that are no longer used in the Intel manuals, but are still used by the general public? If the terms technically do exist, where can I find a reference to them in the manuals?

Recommended answer

There aren't really official names for most forms of x86 addressing modes. They all have the form [base + index*scale + disp8/disp32] (or any subset of 1 or 2 of those components), except for 64-bit RIP-relative addressing. See "Referencing the contents of a memory location. (x86 addressing modes)".
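
As a minimal sketch of that general form, the Python below computes an effective address from whichever components are present. The MemOperand class and the register dictionary are hypothetical names made up for illustration; nothing here comes from Intel's manuals beyond the base + index*scale + disp shape.

```python
from dataclasses import dataclass
from typing import Optional

MASK64 = (1 << 64) - 1  # effective addresses wrap at 64 bits


@dataclass
class MemOperand:
    """Hypothetical model of [base + index*scale + disp]; any part may be absent."""
    base: Optional[str] = None
    index: Optional[str] = None
    scale: int = 1
    disp: int = 0

    def effective_address(self, regs: dict) -> int:
        if self.scale not in (1, 2, 4, 8):
            raise ValueError("scale must be 1, 2, 4 or 8")
        if self.index == "rsp":
            raise ValueError("rsp cannot be encoded as an index")
        addr = self.disp
        if self.base is not None:
            addr += regs[self.base]
        if self.index is not None:
            addr += regs[self.index] * self.scale
        return addr & MASK64


regs = {"rax": 0x1000, "rcx": 3, "rbp": 0x7000}
print(hex(MemOperand(base="rax", index="rcx", scale=8, disp=16).effective_address(regs)))
print(hex(MemOperand(base="rbp", disp=-8).effective_address(regs)))
print(hex(MemOperand(disp=0x400).effective_address(regs)))
```

Every named "mode" from the tutorials is just this function with some fields left at their defaults.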

Intel does officially name those components of addressing modes, in section 3.7.5 of volume 1. They also use Register vs. Immediate vs. Memory, but usually don't make a big deal about different forms of addressing mode.

"Also, the Mod bits in the ModRM byte is a 2 bit field, which makes me wonder if more than 4 addressing modes are possible."

Mod chooses Register vs. Memory with disp0/8/32. There are "escape" codes for more modes:

  • The encodings that would be [rbp] with no displacement instead mean there's a disp32 with no base (or, in 64-bit mode without a SIB byte, RIP+disp32). (This is why you see [rbp+0] in disassembly: the best encoding for [rbp] is base=rbp with a disp8 of 0. Note that [rbp] isn't useful when it's a frame pointer.)
  • The ModR/M encodings that would be base=rsp mean there's a SIB byte.
  • The SIB encodings that would be index=RSP mean no index. (Given the previous rule, this makes it possible to encode [rsp], instead of the less-useful [rsp+rsp].)
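
The three escape rules above can be sketched as a small decoder. This is an illustrative Python sketch, not a full disassembler: decode_modrm is a made-up helper, it assumes 64-bit mode with no REX or address-size prefixes, and it ignores the reg field of ModR/M entirely.

```python
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_modrm(code: bytes) -> str:
    """Describe the r/m operand encoded by ModR/M (+ optional SIB and displacement)."""
    mod, rm = code[0] >> 6, code[0] & 7
    if mod == 3:                      # register operand, no memory access
        return REGS64[rm]
    pos = 1
    base, index, scale = REGS64[rm], None, 1
    disp_size = (0, 1, 4)[mod]        # disp0 / disp8 / disp32
    if rm == 4:                       # "base=rsp" slot: a SIB byte follows
        sib = code[pos]
        pos += 1
        scale = 1 << (sib >> 6)
        if ((sib >> 3) & 7) != 4:     # "index=rsp" slot means no index
            index = REGS64[(sib >> 3) & 7]
        base = REGS64[sib & 7]
        if mod == 0 and (sib & 7) == 5:   # "[rbp] with no disp" slot: disp32, no base
            base, disp_size = None, 4
    elif mod == 0 and rm == 5:        # same slot without SIB: RIP-relative in 64-bit mode
        base, disp_size = "rip", 4
    disp = int.from_bytes(code[pos:pos + disp_size], "little", signed=True)
    terms = [t for t in (base, f"{index}*{scale}" if index else None) if t]
    text = "+".join(terms)
    if disp_size:
        text += format(disp, "+#x")
    return "[" + text.lstrip("+") + "]"


print(decode_modrm(bytes([0x45, 0x00])))        # [rbp] has to carry a disp8 of 0
print(decode_modrm(bytes([0x04, 0x24])))        # [rsp] needs a SIB byte
print(decode_modrm(bytes([0x44, 0x88, 0xF8])))  # base + index*4 + disp8
```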

When writing in English about assembly language, it's natural to use terms with obvious meanings, including some that you mentioned. For example, Intel's optimization manual says (my emphasis):

2.3.2.4 Micro-op Queue and the Loop Stream Detector (LSD)

... (micro-fused uops with indexed addressing modes are un-laminated in the IDQ on SnB)

... For code that is dominated by indexed addressing (as often happens with array processing), recoding algorithms to use base (or base+displacement) addressing can sometimes improve performance by keeping the load plus operation and store instructions fused.

Indexed addressing modes include any combination that uses idx*scale, regardless of whether it's with a base reg or with a disp32, or both. (idx alone is not encodable; [rax*1] is actually encoded as disp32+idx*1 with disp32=0.) At some point they say "any addressing mode with an index" or similar, otherwise it might not be clear exactly what they meant. Of course, testing with performance counters can verify the interpretation.
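
To make the [rax*1] remark concrete, here is a rough encoder sketch in Python. encode_mem is a hypothetical helper that assumes 64-bit mode and registers that need no REX prefix; it emits only the ModR/M, SIB and displacement bytes, with the caller supplying the opcode.

```python
REG_NUM = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
           "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}


def encode_mem(reg_field, base=None, index=None, scale=1, disp=0) -> bytes:
    """ModR/M (+SIB, +disp) bytes for a memory operand; reg_field fills the /r slot."""
    if index == "rsp":
        raise ValueError("rsp cannot be an index")
    scale_bits = {1: 0, 2: 1, 4: 2, 8: 3}[scale]
    if base is None:
        # No base register: only expressible as mod=00 with SIB base=101 and a disp32
        mod, disp_bytes = 0, disp.to_bytes(4, "little", signed=True)
    elif disp == 0 and base != "rbp":
        mod, disp_bytes = 0, b""
    elif -128 <= disp <= 127:
        mod, disp_bytes = 1, disp.to_bytes(1, "little", signed=True)
    else:
        mod, disp_bytes = 2, disp.to_bytes(4, "little", signed=True)
    if index is None and base not in (None, "rsp"):
        # Simple case: no SIB byte needed
        return bytes([mod << 6 | reg_field << 3 | REG_NUM[base]]) + disp_bytes
    sib = (scale_bits << 6
           | (REG_NUM[index] if index else 4) << 3
           | (REG_NUM[base] if base else 5))
    return bytes([mod << 6 | reg_field << 3 | 4, sib]) + disp_bytes


MOV_R32_RM32 = b"\x8b"  # opcode for mov r32, r/m32
print((MOV_R32_RM32 + encode_mem(0, index="rax")).hex(" "))              # mov eax, [rax*1]
print((MOV_R32_RM32 + encode_mem(0, base="rax")).hex(" "))               # mov eax, [rax]
print((MOV_R32_RM32 + encode_mem(0, base="rax", index="rax")).hex(" "))  # mov eax, [rax+rax*1]
```

The index-only form costs four extra bytes for a displacement of zero, which is why it still counts as an indexed mode even though the source text shows no displacement.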

But they don't over-do it with making up names for things. When there isn't an obvious English phrase they can stick on something, they write (still in the Sandybridge section):

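The common load latency is five cycles. When using a simple addressing mode, base plus offset that is smaller than 2048, the load latency can be four cycles.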

In table 2-19, they have two columns, one for Base + Offset > 2048 or Base + Index [+ Offset], and another for Base + Offset < 2048 with latencies 1 cycle lower (except for 256b AVX loads). (Fun fact, [rdi+8] is 1c lower latency than [rdi-8].)
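
One way to read that table, written as a tiny Python predicate. This is only an interpretation of the text above: the function name is made up, and the treatment of negative displacements is inferred from the [rdi-8] remark rather than taken from Intel's table.

```python
def snb_simple_addressing(has_index: bool, disp: int) -> bool:
    """True if a load would fall in the 'Base + Offset < 2048' column (assumed reading)."""
    # Any index register puts the load in the slower column.
    # Negative displacements are assumed not to qualify, per the [rdi-8] remark.
    return (not has_index) and 0 <= disp < 2048


for operand, has_index, disp in [("[rdi+8]", False, 8),
                                 ("[rdi-8]", False, -8),
                                 ("[rdi+4096]", False, 4096),
                                 ("[rdi+rsi*4]", True, 0)]:
    print(operand, "simple" if snb_simple_addressing(has_index, disp) else "not simple")
```

As the answer says about indexed modes, performance counters or a latency microbenchmark are the way to confirm an interpretation like this.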

(Technically they probably should have said "displacement", because the whole addressing mode forms an offset or effective-address in x86 terminology, which forms a linear address when added to the segment base. But "offset" is also used to describe immediate constant parts of addressing modes in non-x86 generic terminology. And x86 segmentation is fortunately not something you usually have to think about these days.)

In the vol.1 manual, Intel does sort of use some of the terminology you describe. They describe an addressing mode with just a displacement component as "direct" (sort of), and [reg] as "indirect", because those terms do get used when talking about instruction-sets and what kind of addressing modes they support.

vol.1 3.7.5 Specifying an Offset

The following addressing modes suggest uses for common combinations of address components.

  • Displacement ⎯ A displacement alone represents a direct (uncomputed) offset to the operand. Because the displacement is encoded in the instruction, this form of an address is sometimes called an absolute or static address. It is commonly used to access a statically allocated scalar operand.

  • Base ⎯ A base alone represents an indirect offset to the operand. ...

But as you saw, they don't make up names for the more complex forms.

They do distinguish between Immediate vs. Register vs. Memory operands, though (3.7 OPERAND ADDRESSING). They usually make little or no distinction between an r/m32 operand that uses a register encoding and the other operand, which has to be a register.

Direct vs. indirect also comes up for branches. It's a bit like talking about the addressing mode for reaching the code bytes that will be run next.

6.3.7 Branch Functions in 64-Bit Mode

...

Address sizes affect the size of RCX used for JCXZ and LOOP; they also impact the address calculation for memory indirect branches. Such addresses are 64 bits by default; but they can be overridden to 32 bits by an address size prefix.

Memory indirect is jmp [rax], where the final value of RIP comes from memory, vs. a register-indirect branch like jmp rax that sets RIP=RAX. x86 doesn't have a memory-indirect addressing mode for loads/stores; code-fetch after a branch is taken introduces the extra level of indirection in the terminology. (sort of).

The vol.2 manual entry for jmp does talk about indirect vs. relative or absolute jumps. (Although note that x86 doesn't have absolute direct near jumps (put an address in a register for that), only absolute far address specified with an immediate pointer (ptr16:16 or ptr16:32) or indirectly with a memory location.)

When describing near indirect jumps, jmp r/m32 (or 64), they say "absolute offset specified indirectly in a GP reg or memory". ("absolute offset" is relative to the CS segment base).

Segmentation makes x86 addressing more complicated to talk about, especially when comparing special addressing modes that can include a segment explicitly vs. ones that don't.

It's far easier to remember what x86 addressing modes can do in terms of subsets of the general case, rather than memorizing all the different possibilities separately with names like Indexed, Based, or whatever.

You see that kind of thing in tutorials like https://www.tutorialspoint.com/microprocessor/microprocessor_8086_addressing_modes.htm or http://www.geeksforgeeks.org/addressing-modes/ that make a big deal out of classifying the addressing modes. The latter even has a quiz asking you to match C statements with some addressing-mode names.

With the less-flexible 16-bit addressing modes, there are few enough that you can try to name them, and Based vs. Indexed does actually give you a different choice of registers. But when you're programming, all you really need to remember is that it's your choice of any subset of [bx|bp] + [di|si] + disp0/8/16. This is how di/si (dst/src index) and maybe bx/bp got their names.
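
The 16-bit case is small enough to list exhaustively. The Python sketch below walks the 16-bit ModR/M r/m table (decode_rm16 is a made-up name); the one escape code is that the slot that would be [bp] with no displacement means a bare disp16 instead.

```python
# r/m field values 0..7 in 16-bit address-size mode
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]


def decode_rm16(mod: int, rm: int) -> str:
    """Name the memory operand selected by a 16-bit ModR/M mod/rm pair."""
    if mod == 3:
        raise ValueError("mod=3 selects a register, not memory")
    if mod == 0 and rm == 6:
        return "[disp16]"            # escape: would have been [bp] with no displacement
    suffix = ("", "+disp8", "+disp16")[mod]
    return f"[{RM16[rm]}{suffix}]"


all_modes = [decode_rm16(mod, rm) for mod in range(3) for rm in range(8)]
print(len(all_modes))
print(all_modes)
```

Every entry is a subset of [bx|bp] + [si|di] + disp, which is easier to remember than a list of per-mode names.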

Terminology like this can be useful in comparing the capabilities of different ISAs. For example, Wikipedia says that old ISAs like PDP-8 made a lot of use of memory-indirect because they had few registers and only 8 bit addressing range with registers.

Wikipedia also says:

Note that there is no generally accepted way of naming the various addressing modes.

There's no sense making a big deal out of naming of modes. If you're writing something, make sure it's clear what you mean without depending on a specific technical meaning for certain terms. e.g. if you say "an index addressing mode", make sure the reader knows from context whether you're including base+index*scale or not.

I wonder if some of the desire to name modes originated with 8-bit micros that predate 8086. You might want to ask about this over on https://retrocomputing.stackexchange.com/. I don't know much about addressing modes available on 8-bit CPUs with mostly fixed one-byte instructions.


09-07 03:12