Problem description

I am looking at a merge conflict marker that looks completely screwed up. To illustrate the situation, let's start with this:

public void methodA() {
    prepare();
    try {
      doSomething();
    }
    catch(Exception e) {
      doSomethingElse();
    }
}

Now a merge comes in (I use SourceTree to pull), and the marker looks like this:

<<<<<<<<< HEAD
    try {
      doSomething();
    }
    catch(Exception e) {
      doSomethingElse();
    }
============================
private void methodB() {
    doOtherStuff();
>>>>>>>> 9832432984384398949873ab
}

So what the pulled commit does is remove methodA completely and add methodB instead.

But notice that some lines are missing entirely.

From what I understand of the process, Git attempts a so-called auto-merge, and if this fails and conflicts are detected, the complete merge is expressed as parts marked with '<<<* HEAD' + before + '====' + after + '>>>* CommitID', ready for manual conflict resolution.

So why does it leave out some lines? It looks more like a bug to me.

I use Windows 7 and the installed Git version is 2.6.2.windows.1. The newest version is 2.9, so I wonder whether any Git version is known to have a merge problem of this magnitude. This is not the first time I have experienced something like this.

Recommended answer

You are correct to be concerned: Git knows nothing of languages, and its built-in merge algorithm is based strictly on line-at-a-time comparisons. You do not have to use this built-in merge algorithm, but most people do because (a) it mostly just works, and (b) there are not that many alternatives.

Note that this depends on your merge strategy (-s argument); the text below is for the default recursive strategy. The resolve strategy is pretty similar to recursive; the octopus strategy applies to more than just two commits; and the ours strategy is entirely different (and is nothing like -X ours). You can also select alternative strategies or algorithms for specific files, using .gitattributes and "merge drivers". And, none of this applies to files that Git has decided to believe are "binary": for these, it does not even attempt merging. (I am not going to cover any of that here, just how the default recursive strategy treats files.)

  • Merge starts with two commits: the current one (also called "ours", "local", and HEAD), and some "other" one (also called "theirs" and "remote")
  • Merge finds the merge base between these commits
    • Normally that's just one other commit: the one at the first point where the implied branches join up
    • In some special cases (multiple merge base candidates), Git must invent a "virtual merge base" (but we'll ignore these cases here)
  • Merge runs two diffs: merge base vs. local, and merge base vs. other
    • These have rename detection turned on
    • You can run these same diffs yourself to see what merge will see

You can think of these two diffs as "what we did" and "what they did". The goal of a merge is to combine "what we did" and "what they did". The diffs are line-based, come from a minimal edit distance algorithm, and are really just Git's guess about what we did and what they did.

The output of the first diff (base-vs-local) tells Git which base files correspond to which local files, i.e., how to follow names from the current commit back to the base. Git can then use the base names to spot renames or deletes in the other commit as well. For the most part we can just ignore rename and delete issues, and also new-file-creation issues. Note that Git version 2.9 turns on rename detection by default for all diffs, not just merge diffs. (You can turn this on yourself in earlier Git versions by configuring diff.renames to true; see also the git config setting for diff.renameLimit.)

If a file is changed on only one side (base-to-local, or base-to-other), Git simply takes those changes. Git only has to do a three-way merge when a file is changed on both sides.
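As a rough illustration of the one-sided case, the sketch below builds a throwaway repository in which each side edits a different file, then merges. The file names (a.txt, b.txt), the branch name theirs, and the demo identity are all made up for this example.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\n' > a.txt
printf 'beta\n'  > b.txt
git add a.txt b.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# "Their" side changes only b.txt.
git checkout -q -b theirs
printf 'beta (theirs)\n' > b.txt
git commit -q -am "theirs: edit b.txt"

# "Our" side changes only a.txt.
git checkout -q "$ours"
printf 'alpha (ours)\n' > a.txt
git commit -q -am "ours: edit a.txt"

# No file was changed on both sides, so each file is taken from
# whichever side changed it and the merge completes on its own.
git merge -q --no-edit theirs
cat a.txt b.txt
```

After the merge, a.txt should carry our edit and b.txt theirs, with nothing left unmerged.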

To perform a three-way merge, Git essentially walks through the two diffs (base-to-local and base-to-other), one "diff hunk" at a time, comparing the changed regions. If a hunk affects a part of the original base file that no hunk in the other diff touches, Git just takes that hunk. If hunks from both diffs affect the same part of the base file and make the same change, Git takes one copy of that change.

For instance, if the local change says "add a close brace line" and the remote change says "add (the same place, same indentation) close brace line", Git will take just one copy of the close brace. If both say "delete a close brace line", Git will just delete the line once.

Git declares a conflict only if the two diffs conflict, e.g., one says "add a close brace line indented 12 spaces" and the other says "add a close brace line indented 11 spaces". By default, Git writes the conflict into the file, showing the two sets of changes and, if you set merge.conflictstyle to diff3, also showing the code from the merge-base version of the file.
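To see what such a conflict looks like with the merge-base text included, here is a small sketch in a scratch repository. Both sides edit the same line differently, and merge.conflictstyle is set to diff3 for that one command. The file name f.txt, the branch name theirs, and the demo identity are invented for the example.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'line 1\nline 2\nline 3\n' > f.txt
git add f.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

git checkout -q -b theirs
printf 'line 1\nline 2 (theirs)\nline 3\n' > f.txt
git commit -q -am "theirs: edit line 2"

git checkout -q "$ours"
printf 'line 1\nline 2 (ours)\nline 3\n' > f.txt
git commit -q -am "ours: edit line 2"

# Both diffs touch line 2 in different ways, so the merge stops.
# git merge exits non-zero on conflict, hence the "|| echo".
git -c merge.conflictstyle=diff3 merge --no-edit theirs \
  || echo "merge stopped with a conflict"

# The work-tree file now holds our text, the base text, and their text.
cat f.txt
```

The section between the ||||||| line and the ======= line is the merge-base version of the conflicting region; without diff3 that section is omitted.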

Git applies any non-conflicting diff hunks. If there were conflicts, Git normally leaves the file in "conflicted merge" state. However, the two -X arguments (-X ours and -X theirs) modify this: with -X ours Git chooses "our" diff hunk in the conflict, and puts that change in, ignoring "their" change. With -X theirs Git chooses "their" diff hunk and puts that change in, ignoring "our" change. These two -X arguments guarantee that Git does not declare a conflict after all.
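The following sketch tries to show that -X theirs only decides the conflicting hunk and leaves non-conflicting hunks alone. Our side edits lines 1 and 7, their side edits line 7 only, in a scratch repository whose file and branch names are made up for the example.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\n' > f.txt
git add f.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

# Their side edits line 7 only.
git checkout -q -b theirs
printf 'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7 (theirs)\n' > f.txt
git commit -q -am "theirs: edit line 7"

# Our side edits line 1 (no conflict) and line 7 (conflict).
git checkout -q "$ours"
printf 'line 1 (ours)\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7 (ours)\n' > f.txt
git commit -q -am "ours: edit lines 1 and 7"

# -X theirs resolves the line-7 conflict in their favor;
# our line-1 hunk does not conflict and is still applied.
git merge -q --no-edit -X theirs theirs
cat f.txt
```

Line 1 should come out as our version and line 7 as theirs, with no conflict markers. Note that our line-7 edit is silently discarded, which is exactly why these options deserve care.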

If Git is able to resolve everything on its own for this file, it does so: you get the base file, plus your local changes, plus their other changes, in the work-tree and in the index/staging-area.

If Git is not able to resolve everything on its own, it puts the base, other, and local versions of the file into the index/staging-area, using the three special nonzero index slots. The work-tree version is always "what Git was able to resolve, plus the conflict markers as directed by various configurable items."

A file such as foo.java is normally staged in slot zero. This means it is ready to go into a new commit now. The other three slots are empty, by definition, because there is a slot-zero entry.

During a conflicted merge, slot zero is left empty, and slots 1-3 are used to hold the merge base version, the "local" or --ours version, and the other or --theirs version. The work-tree holds the in-progress merge.
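The three slots can be inspected directly. The sketch below provokes a conflict in a scratch repository (file name, branch name, and identity are invented) and then reads each slot with the :N:path revision syntax.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'line 1\nline 2\nline 3\n' > f.txt
git add f.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

git checkout -q -b theirs
printf 'line 1\nline 2 (theirs)\nline 3\n' > f.txt
git commit -q -am "theirs: edit line 2"

git checkout -q "$ours"
printf 'line 1\nline 2 (ours)\nline 3\n' > f.txt
git commit -q -am "ours: edit line 2"

git merge --no-edit theirs || echo "merge stopped with a conflict"

# One index entry per nonzero slot; the third column is the slot number.
git ls-files --unmerged

echo "--- slot 1 (merge base) ---"; git show :1:f.txt
echo "--- slot 2 (ours) ---";       git show :2:f.txt
echo "--- slot 3 (theirs) ---";     git show :3:f.txt
```

There is no slot-zero entry for f.txt at this point, so git show :0:f.txt would fail until the conflict is resolved and the file is added.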

You can use git checkout to extract any of these versions, or git checkout -m to re-create the merge conflict. All successful git checkout commands update the work-tree version of the file.

Some git checkout commands leave the various slots undisturbed. Some git checkout commands write into slot 0, wiping out the entries in slots 1-3, so that the file is ready for commit. (To know which ones do what, you just have to memorize them. I had them wrong, in my head, for quite a while.)
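As a partial aid to memory, this sketch exercises three checkout forms on a conflicted file and counts the unmerged index entries after each. It covers only these three forms and is not a complete catalogue; the repository, file name, and branch name are invented.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'line 1\nline 2\nline 3\n' > f.txt
git add f.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

git checkout -q -b theirs
printf 'line 1\nline 2 (theirs)\nline 3\n' > f.txt
git commit -q -am "theirs: edit line 2"

git checkout -q "$ours"
printf 'line 1\nline 2 (ours)\nline 3\n' > f.txt
git commit -q -am "ours: edit line 2"

git merge -q --no-edit theirs || echo "merge stopped with a conflict"

count_unmerged() { git ls-files --unmerged | wc -l | tr -d ' '; }

# 1. --theirs copies slot 3 to the work-tree; slots 1-3 stay in place.
git checkout --theirs -- f.txt
theirs_line=$(sed -n 2p f.txt)
unmerged_after_theirs=$(count_unmerged)

# 2. -m re-creates the conflict markers in the work-tree file.
git checkout -m -- f.txt
markers_after_m=$(grep -c '^<<<<<<<' f.txt)

# 3. Naming a commit writes slot 0 and clears slots 1-3.
git checkout HEAD -- f.txt
unmerged_after_head=$(count_unmerged)

echo "after --theirs: $unmerged_after_theirs unmerged entries"
echo "after -m:       $markers_after_m conflict marker block(s)"
echo "after HEAD:     $unmerged_after_head unmerged entries"
```

In this run, only the last form leaves the file ready to commit; the first two change the work-tree copy but keep the index entry unmerged.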

You cannot run git commit until all unmerged slots have been cleared out. You can use git ls-files --unmerged to view unmerged slots, or git status for a more human-friendly version. (Hint: use git status. Use it often!)

Even if git merge successfully auto-merges everything, that does not mean the result is correct! Of course, when it stops with a conflict, this also means only that Git was not able to auto-merge everything, not that what it did auto-merge on its own is correct. I like to set merge.conflictstyle to diff3 so that I can see what Git thought the base was, before it replaced that "base" code with the two sides of the merge. Often a conflict happens because the diff chose the wrong base (such as some matching braces and/or blank lines) rather than because there had to be an actual conflict.

Using the "patience" diff can help with poor base choice, at least in theory. I have not experimented with this myself. The new "compaction heuristic" in Git 2.9 is promising, but I have not experimented with this either.

You must always inspect and/or test the results of a merge. If the merge is already committed, you can edit files, build and test, git add the corrected versions, and use git commit --amend to shove the previous (incorrect) merge commit out of the way and put in a different commit with the same parents. (The --amend part of git commit --amend is false advertising. It does not change the current commit itself, because it cannot; instead, it makes a new commit with the same parent IDs as the current commit, instead of the normal method of using the current commit's ID as the new commit's parent.)
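The claim about --amend can be checked by comparing commit and parent IDs before and after. The sketch below makes a clean merge in a scratch repository, "corrects" one file, amends, and prints both sets of IDs. File names, branch name, and identity are invented for the example.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\n' > a.txt
printf 'beta\n'  > b.txt
git add a.txt b.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

git checkout -q -b theirs
printf 'beta (theirs)\n' > b.txt
git commit -q -am "theirs: edit b.txt"

git checkout -q "$ours"
printf 'alpha (ours)\n' > a.txt
git commit -q -am "ours: edit a.txt"

git merge -q --no-edit theirs

old_commit=$(git rev-parse HEAD)
old_parents=$(git rev-parse "HEAD^1" "HEAD^2")

# Pretend testing showed the merge result was wrong, and fix it.
printf 'alpha (ours, corrected)\n' > a.txt
git add a.txt
git commit -q --amend --no-edit

new_commit=$(git rev-parse HEAD)
new_parents=$(git rev-parse "HEAD^1" "HEAD^2")

echo "merge commit before amend: $old_commit"
echo "merge commit after amend:  $new_commit"
echo "parents before: $old_parents"
echo "parents after:  $new_parents"
```

The merge commit gets a new ID while both parent IDs stay the same, which matches the description above: the old merge commit is replaced, not modified.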

You can also suppress the auto-commit of a merge with --no-commit. In practice, I have found little need for this: most merges mostly just work, and a quick eyeballing of git show -m and/or "it compiles and passes unit tests" catches problems. However, during a conflicted or --no-commit merge, a simple git diff will give you a combined diff (the same sort you get with git show without -m, after you commit the merge), which can be helpful, or may be more confusing. You can run more-specific git diff commands and/or inspect the three (base, local, other) slot entries, as Gregg noted in a comment.

Besides using diff3 as your merge.conflictstyle, you can see the diffs that git merge will see. All you need to do is run two git diff commands, the same two that git merge will run.

To do these, you must find (or at least tell git diff to find) the merge base. You can use git merge-base, which literally finds the merge base (or all merge bases) and prints them out:

      $ git merge-base --all HEAD foo
      4fb3b9e0570d2fb875a24a037e39bdb2df6c1114
      

This says that between the current branch and branch foo, the merge base is commit 4fb3b9e... (and there is only one such merge base). I can then run git diff 4fb3b9e HEAD and git diff 4fb3b9e foo. But there is an easier way, as long as I can assume that there is only the one merge base:

      $ git diff foo...HEAD   # note: three dots
      

This tells git diff (and only git diff) to find the merge base between foo and HEAD, and then compare that commit, the merge base, to commit HEAD. And:

      $ git diff HEAD...foo   # again, three dots
      

does the same thing: it finds the merge base between HEAD and foo ("merge base" is commutative, so this is the same commit as the other way around, just as 7+2 and 2+7 are both 9), but this time it diffs the merge base against commit foo.
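The equivalence between the three-dot form and an explicit merge-base diff can be checked mechanically. This sketch uses a scratch repository with a branch named foo to mirror the commands above; the file names and identity are invented.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\n' > a.txt
printf 'beta\n'  > b.txt
git add a.txt b.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

git checkout -q -b foo
printf 'beta (foo)\n' > b.txt
git commit -q -am "foo: edit b.txt"

git checkout -q "$ours"
printf 'alpha (ours)\n' > a.txt
git commit -q -am "ours: edit a.txt"

base=$(git merge-base HEAD foo)

# "What we did": merge base vs HEAD, spelled two ways.
explicit_ours=$(git diff "$base" HEAD)
dotted_ours=$(git diff foo...HEAD)

# "What they did": merge base vs foo, spelled two ways.
explicit_theirs=$(git diff "$base" foo)
dotted_theirs=$(git diff HEAD...foo)

[ "$explicit_ours" = "$dotted_ours" ] && echo "foo...HEAD matches base-vs-HEAD"
[ "$explicit_theirs" = "$dotted_theirs" ] && echo "HEAD...foo matches base-vs-foo"
```

With a single merge base, each pair should be identical. Running them before git merge foo gives a preview of the two change-sets the merge will try to combine.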

(For other commands, i.e., things that are not git diff, the three-dot syntax produces a symmetric difference: the set of all commits that are on either branch, but not on both branches. For branches with a single merge base commit, this is "every commit after the merge base, on each branch": in other words, the union of the two branches, excluding the merge base itself and any earlier commits. For branches with multiple merge bases, this subtracts away all the merge bases. For git diff we just assume there's only the one merge base, and instead of subtracting it and its ancestors away, we use it as the left or "before" side of the diff.)
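For comparison, here is how the same three-dot syntax behaves with git rev-list, again in a scratch repository with invented names. Our side has one commit after the merge base and branch foo has two, so the symmetric difference should contain three commits and exclude the base.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\n' > a.txt
printf 'beta\n'  > b.txt
git add a.txt b.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

# Two commits on foo after the merge base.
git checkout -q -b foo
printf 'beta (foo 1)\n' > b.txt
git commit -q -am "foo: first edit"
printf 'beta (foo 2)\n' > b.txt
git commit -q -am "foo: second edit"

# One commit on our side after the merge base.
git checkout -q "$ours"
printf 'alpha (ours)\n' > a.txt
git commit -q -am "ours: edit a.txt"

base=$(git merge-base HEAD foo)
count=$(git rev-list --count HEAD...foo)

echo "commits in HEAD...foo: $count"
# --left-right marks each commit with "<" (HEAD side) or ">" (foo side).
git rev-list --left-right --oneline HEAD...foo
```

The merge base itself does not appear in the list, since it is reachable from both sides.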

In Git, a branch name identifies one particular commit, namely the tip of the branch. In fact, this is how branches actually work: a branch name names a specific commit, and in order to add another commit to the branch (branch here meaning the chain of commits), Git makes a new commit whose parent is the current branch tip, then points the branch name at the new commit. The word "branch" can refer to either the branch name, or the entire chain of commits; we are supposed to figure out which one by context.

At any time, we can name one specific commit, and treat that as a branch, by taking that commit and all its ancestors: its parent, its parent's parent, and so on. When we hit a merge commit (a commit with two or more parents) in this process, we take all the parent commits, and their parents, and so on.

This algorithm is actually selectable. The default myers is based on an algorithm by Eugene Myers, but Git has a few other options.
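A minimal sketch of how the algorithm can be selected, both for git diff (--diff-algorithm) and for a merge (-X diff-algorithm=...). It only demonstrates the switches on a trivial scratch repository with invented names; it makes no claim that one algorithm yields better hunks than another here.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\n' > f.txt
git add f.txt
git commit -q -m "base"
ours=$(git symbolic-ref --short HEAD)

git checkout -q -b theirs
printf 'line 1\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7 (theirs)\n' > f.txt
git commit -q -am "theirs: edit line 7"

git checkout -q "$ours"
printf 'line 1 (ours)\nline 2\nline 3\nline 4\nline 5\nline 6\nline 7\n' > f.txt
git commit -q -am "ours: edit line 1"

# Same pair of commits, diffed with each available algorithm.
for algo in myers minimal patience histogram; do
  lines=$(git diff --diff-algorithm="$algo" "HEAD~1" HEAD | wc -l | tr -d ' ')
  echo "$algo: $lines lines of patch output"
done

# The merge can be told to use a particular algorithm as well.
git merge -q --no-edit -X diff-algorithm=patience theirs
cat f.txt
```

To make a choice permanent for git diff, the diff.algorithm configuration setting takes the same values.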
