• Timsort 是一种稳定高效的排序算法,混合了合并排序和二分插入排序,旨在很好地处理多种真实数据。它由Tim Peters于2002年实施使用在Python编程语言中。该算法查找已经排序的数据的子序列,并使用该知识更有效地对其余部分进行排序。从版本2.3开始,Timsort一直是Python的标准排序算法。如今,Timsort 已是是 Python、 Java、 Android平台 和 GNU Octave 的默认排序算法。
  • 大致思路是:因为二分插入排序处理短序列的效率比较高,所以就把待排序的长序列切分成多个短序列,每个短序列独立执行二分插入排序,最后再对排好序的短序列做归并排序。在归并排序时,通过检测序列中已经排好序的片段,巧妙地把这些片段直接作为排序结果,跳过归并操作,从而减少了归并排序时的迭代次数。
  • 复杂度比较(借用网上的图,原图地址):

  • jdk/lib/src.zip/java.base/java/util/TimSort.java
package java.util;

/**
* A stable, adaptive, iterative mergesort that requires far fewer than
* n lg(n) comparisons when running on partially sorted arrays, while
* offering performance comparable to a traditional mergesort when run
* on random arrays. Like all proper mergesorts, this sort is stable and
* runs O(n log n) time (worst case). In the worst case, this sort requires
* temporary storage space for n/2 object references; in the best case,
* it requires only a small constant amount of space.
*
* This implementation was adapted from Tim Peters's list sort for
* Python, which is described in detail here:
*
* http://svn.python.org/projects/python/trunk/Objects/listsort.txt
*
* Tim's C code may be found here:
*
* http://svn.python.org/projects/python/trunk/Objects/listobject.c
*
* The underlying techniques are described in this paper (and may have
* even earlier origins):
*
* "Optimistic Sorting and Information Theoretic Complexity"
* Peter McIlroy
* SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms),
* pp 467-474, Austin, Texas, 25-27 January 1993.
*
* While the API to this class consists solely of static methods, it is
* (privately) instantiable; a TimSort instance holds the state of an ongoing
* sort, assuming the input array is large enough to warrant the full-blown
* TimSort. Small arrays are sorted in place, using a binary insertion sort.
*
* @author Josh Bloch
*/
class TimSort<T> {
/**
* This is the minimum sized sequence that will be merged. Shorter
* sequences will be lengthened by calling binarySort. If the entire
* array is less than this length, no merges will be performed.
*
* This constant should be a power of two. It was 64 in Tim Peter's C
* implementation, but 32 was empirically determined to work better in
* this implementation. In the unlikely event that you set this constant
* to be a number that's not a power of two, you'll need to change the
* {@link #minRunLength} computation.
*
* If you decrease this constant, you must change the stackLen
* computation in the TimSort constructor, or you risk an
* ArrayOutOfBounds exception. See listsort.txt for a discussion
* of the minimum stack length required as a function of the length
* of the array being sorted and the minimum merge sequence length.
*/
private static final int MIN_MERGE = 32;

/**
* The array being sorted.
*/
private final T[] a;

/**
* The comparator for this sort.
* 比较器
*/
private final Comparator<? super T> c;

/**
* When we get into galloping mode, we stay there until both runs win less
* often than MIN_GALLOP consecutive times.
*/
private static final int MIN_GALLOP = 7;

/**
* This controls when we get *into* galloping mode. It is initialized
* to MIN_GALLOP. The mergeLo and mergeHi methods nudge it higher for
* random data, and lower for highly structured data.
*/
private int minGallop = MIN_GALLOP;

/**
* Maximum initial size of tmp array, which is used for merging. The array
* can grow to accommodate demand.
* 归并排序中临时数组的初始长度
*
* Unlike Tim's original C version, we do not allocate this much storage
* when sorting smaller arrays. This change was required for performance.
*/
private static final int INITIAL_TMP_STORAGE_LENGTH = 256;

/**
* Temp storage for merges. A workspace array may optionally be
* provided in constructor, and if so will be used as long as it
* is big enough.
* 归并排序的临时数组,tmpBase是数组中有效数据的起始下标,tmpLen是有效数据的长度
*/
private T[] tmp;
private int tmpBase; // base of tmp array slice
private int tmpLen; // length of tmp array slice

/**
* A stack of pending runs yet to be merged. Run i starts at
* address base[i] and extends for len[i] elements. It's always
* true (so long as the indices are in bounds) that:
*
* runBase[i] + runLen[i] == runBase[i + 1]
*
* so we could cut the storage for this, but it's a minor amount,
* and keeping all the info explicit simplifies the code.
*/
private int stackSize = 0; // Number of pending runs on stack
private final int[] runBase;
private final int[] runLen;

/**
* Creates a TimSort instance to maintain the state of an ongoing sort.
* 构造函数是私有的,单例模式
*
* @param a the array to be sorted
* @param c the comparator to determine the order of the sort
* @param work a workspace array (slice)
* @param workBase origin of usable space in work array
* @param workLen usable size of work array
*/
private TimSort(T[] a, Comparator<? super T> c, T[] work, int workBase, int workLen) {
this.a = a;
this.c = c;

// Allocate temp storage (which may be increased later if necessary)
// 确定临时数组的初始长度,如果低于默认值256的2倍,则空间大小为原始数组a的长度的一半,否则为默认长度256
int len = a.length;
int tlen = (len < 2 * INITIAL_TMP_STORAGE_LENGTH) ?
len >>> 1 : INITIAL_TMP_STORAGE_LENGTH;
// 排序需要的临时数组长度是tlen,如果给定的work数组无法容纳tlen长度的数据,就创建新的临时数组,否则就直接把work数组当临时数组用
// work数组貌似没什么用,只有在ArraysParallelSortHelpers.java中会设置work,绝大多数的类调用TimSort时work参数都是null,所以没必要看了
if (work == null || workLen < tlen || workBase + tlen > work.length) {
// 屏蔽泛型数组转型产生的警告
@SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
T[] newArray = (T[])java.lang.reflect.Array.newInstance
(a.getClass().getComponentType(), tlen);
tmp = newArray;
tmpBase = 0;
tmpLen = tlen;
}
else {
tmp = work;
tmpBase = workBase;
tmpLen = workLen;
}

/*
* Allocate runs-to-be-merged stack (which cannot be expanded). The
* stack length requirements are described in listsort.txt. The C
* version always uses the same stack length (85), but this was
* measured to be too expensive when sorting "mid-sized" arrays (e.g.,
* 100 elements) in Java. Therefore, we use smaller (but sufficiently
* large) stack lengths for smaller arrays. The "magic numbers" in the
* computation below must be changed if MIN_MERGE is decreased. See
* the MIN_MERGE declaration above for more information.
* The maximum value of 49 allows for an array up to length
* Integer.MAX_VALUE-4, if array is filled by the worst case stack size
* increasing scenario. More explanations are given in section 4 of:
* http://envisage-project.eu/wp-content/uploads/2015/02/sorting.pdf

* 存储run的栈,不能在运行时动态扩展。一个长数组会被切分成多个短数组分别排序,run存储的就是每个子数组的长度和在原数组中的起始下标,在最后归并的时候用来定位子数组
*/
int stackLen = (len < 120 ? 5 :
len < 1542 ? 10 :
len < 119151 ? 24 : 49);
runBase = new int[stackLen];
runLen = new int[stackLen];
}

/*
* The next method (package private and static) constitutes the
* entire API of this class.
* sort方法是TimSort类对外暴露的唯一方法,其余都是私有方法
*/

/**
* Sorts the given range, using the given workspace array slice
* for temp storage when possible. This method is designed to be
* invoked from public methods (in class Arrays) after performing
* any necessary array bounds checks and expanding parameters into
* the required forms.
*
* 对数组a中的[lo, hi)片段做排序
* @param a the array to be sorted
* @param lo the index of the first element, inclusive, to be sorted
* @param hi the index of the last element, exclusive, to be sorted
* @param c the comparator to use
* @param work a workspace array (slice)
* @param workBase origin of usable space in work array
* @param workLen usable size of work array
* @since 1.8
*/
static <T> void sort(T[] a, int lo, int hi, Comparator<? super T> c,
T[] work, int workBase, int workLen) {
assert c != null && a != null && lo >= 0 && lo <= hi && hi <= a.length;
// 待排序的元素个数
int nRemaining = hi - lo;
// 待排序元素只有0个或1个,不需要排序
if (nRemaining < 2)
return; // Arrays of size 0 and 1 are always sorted

// If array is small, do a "mini-TimSort" with no merges
// 如果待排序元素少于归并排序的最小元素数量,就执行不包含归并排序的版本
if (nRemaining < MIN_MERGE) {
// 获取从a[lo]开始的最长有序连续片段的长度,如果片段是递减的会被反转成递增,得到的initRunLen就是从a[lo]开始已经排好序的长度
int initRunLen = countRunAndMakeAscending(a, lo, hi, c);
// 调用二分插入排序,对a[lo + initRunLen]到a[hi]的所有元素执行插入
binarySort(a, lo, hi, lo + initRunLen, c);
return;
}

/**
* March over the array once, left to right, finding natural runs,
* extending short natural runs to minRun elements, and merging runs
* to maintain stack invariant.
*/
// 创建TimSort实例,保存中间结果
TimSort<T> ts = new TimSort<>(a, c, work, workBase, workLen);
// 切分子数组的最小长度
int minRun = minRunLength(nRemaining);
do {
// Identify next run
// a[lo]开始已经排好序的长度
int runLen = countRunAndMakeAscending(a, lo, hi, c);

// If run is short, extend to min(minRun, nRemaining)
// 切分子数组,如果剩余长度不足最小长度,就只排序剩余长度的子数组。
// 为什么minRun会被解释成minimum length呢?当nRemaining <= minRun时,子数组长度就是nRemaining,说明minRun不是长度的下限,而是上限。
if (runLen < minRun) {
int force = nRemaining <= minRun ? nRemaining : minRun;
// 对切分出的a[lo, lo+force]片段执行二分插入排序,从a[lo+runLen]元素开始执行插入
binarySort(a, lo, lo + force, lo + runLen, c);
runLen = force;
}

// Push run onto pending-run stack, and maybe merge
// 记录当前子数组的run信息
ts.pushRun(lo, runLen);
// 每生成一个run后,尝试做归并
ts.mergeCollapse();

// Advance to find next run
// 把lo后移到下个待切分的子数组头部
lo += runLen;
nRemaining -= runLen;
} while (nRemaining != 0);

// Merge all remaining runs to complete sort
// 所有切分都完成后,lo一定等于hi
assert lo == hi;
// 做最终的归并
ts.mergeForceCollapse();
// 归并完成后只有一个run
assert ts.stackSize == 1;
}

/**
* Sorts the specified portion of the specified array using a binary
* insertion sort. This is the best method for sorting small numbers
* of elements. It requires O(n log n) compares, but O(n^2) data
* movement (worst case).
* 二分插入排序,对于小片段排序是最优算法
* 在调用binarySort前已经调用过countRunAndMakeAscending,确保[lo, start)的片段已经排好序了
*
* If the initial part of the specified range is already sorted,
* this method can take advantage of it: the method assumes that the
* elements from index {@code lo}, inclusive, to {@code start},
* exclusive are already sorted.
*
* @param a the array in which a range is to be sorted
* @param lo the index of the first element in the range to be sorted
* @param hi the index after the last element in the range to be sorted
* @param start the index of the first element in the range that is
* not already known to be sorted ({@code lo <= start <= hi})
* @param c comparator to used for the sort
*/
// 屏蔽与switch陈述式中遗漏break相关的警告
@SuppressWarnings("fallthrough")
private static <T> void binarySort(T[] a, int lo, int hi, int start,
Comparator<? super T> c) {
// 感觉这种assert毫无意义
assert lo <= start && start <= hi;
if (start == lo)
start++;
//从start开始,逐个进行插入,start是递增的
for ( ; start < hi; start++) {
// pivot是当前待插入的元素
T pivot = a[start];

// Set left (and right) to the index where a[start] (pivot) belongs
// a[lo, start)片段是排好序的,a[start]是待排序的,所以每一轮的插入实际就是把a[start]插入到a[lo, start)中,插入的结果就是a[lo, start]是排好序的,因为start是循环变量,所以最后一轮插入完成后,排好序的a[lo, start]就是a[lo, hi]
// 在调用binarySort前确定a[lo, start)排好序,就省掉了a[lo+1]到a[start-1]这些元素的插入操作,对于有很多有序片段的数组排序效率很高
int left = lo;
int right = start;
assert left <= right;
/*
* Invariants:
* pivot >= all in [lo, left).
* pivot < all in [right, start).
*/
// 二分查找确定插入位置
while (left < right) {
int mid = (left + right) >>> 1;
if (c.compare(pivot, a[mid]) < 0)
right = mid;
else
left = mid + 1;
}
assert left == right;

/*
* The invariants still hold: pivot >= all in [lo, left) and
* pivot < all in [left, start), so pivot belongs at left. Note
* that if there are elements equal to pivot, left points to the
* first slot after them -- that's why this sort is stable.
* Slide elements over to make room for pivot.
*/
// 插入位置是left,所以包含a[left]在内的后面所有元素都要后移一位
int n = start - left; // The number of elements to move
// Switch is just an optimization for arraycopy in default case
// 一旦case匹配,会无条件执行后面所有case直到遇到break,所以case2和case1要么都执行要么都不执行
switch (n) {
// 要移动的元素小于等于2个,直接用赋值方式移动
case 2: a[left + 2] = a[left + 1];
case 1: a[left + 1] = a[left];
break;
// 要移动的元素大于2个,用arraycopy把a[left, start-1]写到a[left+1, start],直接复制内存
default: System.arraycopy(a, left, a, left + 1, n);
}
// 插入pivot
a[left] = pivot;
}
}

/**
* Returns the length of the run beginning at the specified position in
* the specified array and reverses the run if it is descending (ensuring
* that the run will always be ascending when the method returns).
*
* A run is the longest ascending sequence with:
*
* a[lo] <= a[lo + 1] <= a[lo + 2] <= ...
*
* or the longest descending sequence with:
*
* a[lo] > a[lo + 1] > a[lo + 2] > ...
*
* For its intended use in a stable mergesort, the strictness of the
* definition of "descending" is needed so that the call can safely
* reverse a descending sequence without violating stability.
*
* @param a the array in which a run is to be counted and possibly reversed
* @param lo index of the first element in the run
* @param hi index after the last element that may be contained in the run.
It is required that {@code lo < hi}.
* @param c the comparator to used for the sort
* @return the length of the run beginning at the specified position in
* the specified array

* 返回从a[lo]开始的最长有序连续片段的长度
*/
private static <T> int countRunAndMakeAscending(T[] a, int lo, int hi,
Comparator<? super T> c) {
assert lo < hi;
// hi-lo<2 的情况在调用countRunAndMakeAscending前已经判断过了,这里没必要
int runHi = lo + 1;
if (runHi == hi)
return 1;

// Find end of run, and reverse range if descending
if (c.compare(a[runHi++], a[lo]) < 0) { // Descending
while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) < 0)
runHi++;
// 默认的排序结果是递增,所以如果从a[lo]开始是递减的,要把最长递减片段反转
// 因为runHi++了,所以反转的范围是[lo, runHi)
reverseRange(a, lo, runHi);
} else { // Ascending
while (runHi < hi && c.compare(a[runHi], a[runHi - 1]) >= 0)
runHi++;
}

return runHi - lo;
}

/**
* Reverse the specified range of the specified array.
* 反转数组a中[lo, hi)的片段,因为调用reverseRange前hi多++了一次,所以反转不包含a[hi]
* @param a the array in which a range is to be reversed
* @param lo the index of the first element in the range to be reversed
* @param hi the index after the last element in the range to be reversed
*/
private static void reverseRange(Object[] a, int lo, int hi) {
hi--;
// 反转过程就是逐层交换两端的值
while (lo < hi) {
Object t = a[lo];
a[lo++] = a[hi];
a[hi--] = t;
}
}

/**
* Returns the minimum acceptable run length for an array of the specified
* length. Natural runs shorter than this will be extended with
* {@link #binarySort}.

* 根据待排序的元素数量,返回参与归并的子数组的最小长度,也就是执行二分插入排序的子数组的最小长度。因为二分插入排序适合小片段排序,所以对于长的数组,要拆分成多个子数组,分别执行二分插入排序,最后再归并
* 调用minRunLength前已经判断过nRemaining < MIN_MERGE的情况,所以这里的n一定是大于等于MIN_MERGE的。如果n是2的幂次,返回的就是MIN_MERGE/2,否则返回的是[MIN_MERGE/2, MIN_MERGE]内的一个值
* Roughly speaking, the computation is:
*
* If n < MIN_MERGE, return n (it's too small to bother with fancy stuff).
* Else if n is an exact power of 2, return MIN_MERGE/2.
* Else return an int k, MIN_MERGE/2 <= k <= MIN_MERGE, such that n/k
* is close to, but strictly less than, an exact power of 2.
*
* For the rationale, see listsort.txt.
*
* @param n the length of the array to be sorted
* @return the length of the minimum run to be merged
*/
private static int minRunLength(int n) {
assert n >= 0;
int r = 0; // Becomes 1 if any 1 bits are shifted off
while (n >= MIN_MERGE) {
r |= (n & 1);
n >>= 1;
}
return n + r;
}

/**
* Pushes the specified run onto the pending-run stack.
*
* @param runBase index of the first element in the run
* @param runLen the number of elements in the run
* stack不能扩容,所以这里会有越界的风险
*/
private void pushRun(int runBase, int runLen) {
this.runBase[stackSize] = runBase;
this.runLen[stackSize] = runLen;
stackSize++;
}

/**
* Examines the stack of runs waiting to be merged and merges adjacent runs
* until the stack invariants are reestablished:
*
* 1. runLen[i - 3] > runLen[i - 2] + runLen[i - 1]
* 2. runLen[i - 2] > runLen[i - 1]
*
* This method is called each time a new run is pushed onto the stack,
* so the invariants are guaranteed to hold for i < stackSize upon
* entry to the method.

* 每次生成一个新的run后,都要尝试做一次归并,让stack中的run序列满足上面两个式子,大概意思就是保证越靠前的run长度越长。因为归并排序的最优情况就是先归并短序列,再归并长序列,对于这里的stack,最终的归并是从后往前按顺序归并的,如果run的分布长短交错,就会经常遇到一个长序列和一个短序列做归并,效率非常低,所以每生成一个run后都要尝试做归并,尽量把相邻的短序列提早归并掉,来保证最终做归并时待归并的序列越来越长。
* 因为每次插入新的run后,stackSize最后都加一,所以最后一个run的下标是stackSize-1,从后往前归并时先归并下标是stackSize-2和stackSize-1的两个run,也就是调用mergeAt(stackSize-2)
*/
private void mergeCollapse() {
while (stackSize > 1) {
int n = stackSize - 2;
if (n > 0 && runLen[n-1] <= runLen[n] + runLen[n+1]) {
if (runLen[n - 1] < runLen[n + 1])
n--;
mergeAt(n);
} else if (runLen[n] <= runLen[n + 1]) {
mergeAt(n);
} else {
break; // Invariant is established
}
}
}

/**
* Merges all runs on the stack until only one remains. This method is
* called once, to complete the sort.
* 最终的归并,在stack中从后往前按顺序归并所有的run
*/
private void mergeForceCollapse() {
while (stackSize > 1) {
int n = stackSize - 2;
if (n > 0 && runLen[n - 1] < runLen[n + 1])
n--;
mergeAt(n);
}
}

/**
* Merges the two runs at stack indices i and i+1. Run i must be
* the penultimate or antepenultimate run on the stack. In other words,
* i must be equal to stackSize-2 or stackSize-3.

* 归并下标是i和i+1的两个run,要保证i是stackSize-2或stackSize-3,也就是保证归并的两个run在stack的尾部,在调用mergeCollapse时可能会跳过最后一个run,归并倒数第二个和第三个run,所以i也可以是stackSize-3。因为每次归并完stackSize都会减小,所以i和stackSize的关系会始终如此。
* 归并的结果就是两个子数组原地排好序,同时在stack中两个run合并成一个
* @param i stack index of the first of the two runs to merge
*/
private void mergeAt(int i) {
assert stackSize >= 2;
assert i >= 0;
assert i == stackSize - 2 || i == stackSize - 3;

int base1 = runBase[i];
int len1 = runLen[i];
int base2 = runBase[i + 1];
int len2 = runLen[i + 1];
// 为什么要加一堆assert,debug忘删了?
assert len1 > 0 && len2 > 0;
assert base1 + len1 == base2;

/*
* Record the length of the combined runs; if i is the 3rd-last
* run now, also slide over the last run (which isn't involved
* in this merge). The current run (i+1) goes away in any case.
*
* 归并之前,先把runBase、runLen、stackSize更新到归并后的结果,反正当前的值已经保存在临时变量中了
*/
runLen[i] = len1 + len2;
if (i == stackSize - 3) {
runBase[i + 1] = runBase[i + 2];
runLen[i + 1] = runLen[i + 2];
}
stackSize--;

/*
* Find where the first element of run2 goes in run1. Prior elements
* in run1 can be ignored (because they're already in place).
*/
// 查找run2的第一个元素a[base2]在run1中的插入位置,有相同元素时插入位置靠后
int k = gallopRight(a[base2], a, base1, len1, 0, c);
assert k >= 0;
// run2的第一个元素插入到run1的第k个位置,所以归并后a[base1, base1+k)一定在开头,不需要参与归并的过程,直接略过
// 之所以有相同元素时插入位置靠后,就是保证略过的元素尽量多,提高效率
base1 += k;
len1 -= k;
// 如果略过了整个run1,说明排序的结果就是整个run2都在run1的后面,已经排好了,不需要再执行归并
if (len1 == 0)
return;

/*
* Find where the last element of run1 goes in run2. Subsequent elements
* in run2 can be ignored (because they're already in place).
*/
// 查找run1的最后一个元素a[base1 + len1 - 1]在run2中的插入位置,有相同元素时插入位置靠前
len2 = gallopLeft(a[base1 + len1 - 1], a, base2, len2, len2 - 1, c);
// run1的最后一个元素插入到run2的第len2个位置,所以归并后a(base2+len2, base2+oldlen2]一定在末尾,不需要参与归并的过程,直接略过
// 之所以有相同元素时插入位置靠前,也是保证略过的元素尽量多
assert len2 >= 0;
if (len2 == 0)
return;

// Merge remaining runs, using tmp array with min(len1, len2) elements
// run1和run2挨着,要想原地排序就一定要把其中一个run放入临时数组,空出一些位置来存放排好序的元素,否则就需要各种复杂的原地交换操作。
// 为了节省空间,选择把run1和run2中较短的一个放入临时数组。如果run1放入临时数组,空出的位置就在前面,所以mergeLo的归并顺序就是从base1往后按从小到大的顺序存放归并的元素。如果run2放入临时数组,空出的位置就在后面,所以mergeHi的归并顺序就是从base2+len2往前按从大到小的顺序存放归并的元素
if (len1 <= len2)
mergeLo(base1, len1, base2, len2);
else
mergeHi(base1, len1, base2, len2);
}

/**
* Locates the position at which to insert the specified key into the
* specified sorted range; if the range contains an element equal to key,
* returns the index of the leftmost equal element.
* 把key插入到base和len对应的run中,返回插入位置。
* 如果遇到与key相同的元素,就会插到相同元素的前面,因为判定条件中左侧是严格大于,右侧是小于等于
* hint是开始查找的位置,是子数组从0开始的下标,0 <= hint < n,所以是从a[base + hint]开始比较
* 返回值是子数组从0开始的下标,而不是原数组中的下标
* 为什么不直接二分查找呢?效率会差很多?
*
* @param key the key whose insertion point to search for
* @param a the array in which to search
* @param base the index of the first element in the range
* @param len the length of the range; must be > 0
* @param hint the index at which to begin the search, 0 <= hint < n.
* The closer hint is to the result, the faster this method will run.
* @param c the comparator used to order the range, and to search
* @return the int k, 0 <= k <= n such that a[b + k - 1] < key <= a[b + k],
* pretending that a[b - 1] is minus infinity and a[b + n] is infinity.
* In other words, key belongs at index b + k; or in other words,
* the first k elements of a should precede key, and the last n - k
* should follow it.
*/
private static <T> int gallopLeft(T key, T[] a, int base, int len, int hint,
Comparator<? super T> c) {
assert len > 0 && hint >= 0 && hint < len;
int lastOfs = 0;
int ofs = 1;
// key > a[base + hint]
if (c.compare(key, a[base + hint]) > 0) {
// Gallop right until a[base+hint+lastOfs] < key <= a[base+hint+ofs]
int maxOfs = len - hint;
// 从a[base+hint+1]开始向右遍历,直到 key <= a[base+hint+ofs],lastOfs是上一轮迭代的ofs,所以就有a[base+hint+lastOfs] < key <= a[base+hint+ofs]
while (ofs < maxOfs && c.compare(key, a[base + hint + ofs]) > 0) {
lastOfs = ofs;
// 偏移ofs是按照1、3、7、15这样指数级增长的,因为a[base, base+len]已经是有序的了,所以不怕步子太大漏掉
ofs = (ofs << 1) + 1;
// 如果ofs整型溢出了,就直接设为maxOfs,把右边界拉到a[base+hint+len]
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;

// Make offsets relative to base
// ofs是从hint开始算的偏移量,这里加上hint以后,ofs就变成子数组的下标了,后面就不需要hint了
lastOfs += hint;
ofs += hint;
} else { // key <= a[base + hint]
// Gallop left until a[base+hint-ofs] < key <= a[base+hint-lastOfs]
final int maxOfs = hint + 1;
// 从a[base+hint-1]开始向左遍历,直到 key > a[base+hint-ofs],lastOfs是上一轮迭代的ofs,所以就有a[base+hint-ofs] < key <= a[base+hint-lastOfs]
while (ofs < maxOfs && c.compare(key, a[base + hint - ofs]) <= 0) {
lastOfs = ofs;
ofs = (ofs << 1) + 1;
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;

// Make offsets relative to base
// 因为是向左遍历,lastOfs应该在ofs的右边,这里为了和上面的结果保持一致,交换了lastOfs和ofs,让lastOfs在ofs的左边
int tmp = lastOfs;
lastOfs = hint - ofs;
ofs = hint - tmp;
}
// lastOfs不可能是负数吧?
assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;

/*
* Now a[base+lastOfs] < key <= a[base+ofs], so key belongs somewhere
* to the right of lastOfs but no farther right than ofs. Do a binary
* search, with invariant a[base + lastOfs - 1] < key <= a[base + ofs].
*
* 到此已经确定了a[base+lastOfs] < key <= a[base+ofs],在这个范围里进行二分查找确定key的插入位置
*/
lastOfs++;
while (lastOfs < ofs) {
int m = lastOfs + ((ofs - lastOfs) >>> 1);

if (c.compare(key, a[base + m]) > 0)
lastOfs = m + 1; // a[base + m] < key
else
ofs = m; // key <= a[base + m]
}
assert lastOfs == ofs; // so a[base + ofs - 1] < key <= a[base + ofs]
return ofs;
}

/**
* Like gallopLeft, except that if the range contains an element equal to
* key, gallopRight returns the index after the rightmost equal element.
*
* 和gallopLeft差不多,区别是如果遇到与key相同的元素,就会插到相同元素的后面,因为判定条件中左侧是大于等于,右侧是严格小于
* @param key the key whose insertion point to search for
* @param a the array in which to search
* @param base the index of the first element in the range
* @param len the length of the range; must be > 0
* @param hint the index at which to begin the search, 0 <= hint < n.
* The closer hint is to the result, the faster this method will run.
* @param c the comparator used to order the range, and to search
* @return the int k, 0 <= k <= n such that a[b + k - 1] <= key < a[b + k]
*/
private static <T> int gallopRight(T key, T[] a, int base, int len,
int hint, Comparator<? super T> c) {
assert len > 0 && hint >= 0 && hint < len;

int ofs = 1;
int lastOfs = 0;
if (c.compare(key, a[base + hint]) < 0) {
// Gallop left until a[b+hint - ofs] <= key < a[b+hint - lastOfs]
int maxOfs = hint + 1;
while (ofs < maxOfs && c.compare(key, a[base + hint - ofs]) < 0) {
lastOfs = ofs;
ofs = (ofs << 1) + 1;
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;

// Make offsets relative to b
int tmp = lastOfs;
lastOfs = hint - ofs;
ofs = hint - tmp;
} else { // a[b + hint] <= key
// Gallop right until a[b+hint + lastOfs] <= key < a[b+hint + ofs]
int maxOfs = len - hint;
while (ofs < maxOfs && c.compare(key, a[base + hint + ofs]) >= 0) {
lastOfs = ofs;
ofs = (ofs << 1) + 1;
if (ofs <= 0) // int overflow
ofs = maxOfs;
}
if (ofs > maxOfs)
ofs = maxOfs;

// Make offsets relative to b
lastOfs += hint;
ofs += hint;
}
assert -1 <= lastOfs && lastOfs < ofs && ofs <= len;

/*
* Now a[b + lastOfs] <= key < a[b + ofs], so key belongs somewhere to
* the right of lastOfs but no farther right than ofs. Do a binary
* search, with invariant a[b + lastOfs - 1] <= key < a[b + ofs].
*/
lastOfs++;
while (lastOfs < ofs) {
int m = lastOfs + ((ofs - lastOfs) >>> 1);

if (c.compare(key, a[base + m]) < 0)
ofs = m; // key < a[b + m]
else
lastOfs = m + 1; // a[b + m] <= key
}
assert lastOfs == ofs; // so a[b + ofs - 1] <= key < a[b + ofs]
return ofs;
}

/**
* Merges two adjacent runs in place, in a stable fashion. The first
* element of the first run must be greater than the first element of the
* second run (a[base1] > a[base2]), and the last element of the first run
* (a[base1 + len1-1]) must be greater than all elements of the second run.
*
* For performance, this method should be called only when len1 <= len2;
* its twin, mergeHi should be called if len1 >= len2. (Either method
* may be called if len1 == len2.)
*
* @param base1 index of first element in first run to be merged
* @param len1 length of first run to be merged (must be > 0)
* @param base2 index of first element in second run to be merged
* (must be aBase + aLen)
* @param len2 length of second run to be merged (must be > 0)
*/
private void mergeLo(int base1, int len1, int base2, int len2) {
assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

// Copy first run into temp array
T[] a = this.a; // For performance
// 检查临时数组的长度,必要时扩容
T[] tmp = ensureCapacity(len1);
int cursor1 = tmpBase; // Indexes into tmp array
int cursor2 = base2; // Indexes int a
int dest = base1; // Indexes int a
// 把run1写入临时数组
System.arraycopy(a, base1, tmp, cursor1, len1);

// Move first element of second run and deal with degenerate cases
// 因为run1前面一部分被略过了,剩下的部分里一定是run2的第一个元素在开头
a[dest++] = a[cursor2++];
// 如果run2只有一个元素,插入到头部,剩下的run1直接后移一位,排序就完成了
if (--len2 == 0) {
System.arraycopy(tmp, cursor1, a, dest, len1);
return;
}
// 因为run2后面一部分被略过了,剩下的部分里一定是run1的最后一个元素在末尾
// 如果run1只有一个元素,插入到末尾,剩下的run2直接前移一位,排序就完成了
// 直接return了,所以len1归不归零都无所谓
if (len1 == 1) {
System.arraycopy(a, cursor2, a, dest, len2);
a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
return;
}

Comparator<? super T> c = this.c; // Use local variable for performance
// 用新的变量,本次对阈值minGallop的修改不会影响下次归并
int minGallop = this.minGallop; // " " " " "
outer:
while (true) {
int count1 = 0; // run1连续比run2小多少次
int count2 = 0; // run2连续比run1小多少次

/*
* Do the straightforward thing until (if ever) one run starts
* winning consistently.
*/
// 正常的归并操作,但是如果其中一个run的头部连续minGallop次都比另一个run小,就先中止当前的归并,退出循环做一些优化
do {
assert len1 > 1 && len2 > 0;
if (c.compare(a[cursor2], tmp[cursor1]) < 0) {
a[dest++] = a[cursor2++];
count2++;
count1 = 0;
//如果其中一个run空了,表示归并完成,退出最外层循环
if (--len2 == 0)
break outer;
} else {
a[dest++] = tmp[cursor1++];
count1++;
count2 = 0;
if (--len1 == 1)
break outer;
}
// count1和count2总有一个是0,或运算得到的就是不为0的那个
} while ((count1 | count2) < minGallop);

/*
* One run is winning so consistently that galloping may be a
* huge win. So try that, and continue galloping until (if ever)
* neither run appears to be winning consistently anymore.
*/
// 如果runA的头部连续minGallop次比runB小,那么很可能其后面很长一部分都比runB的头部小。这时逐个元素进行归并效率就比较低。所以调用gallop方法,找到runB的头部在runA中插入的位置,使用arraycopy直接把runA中该位置之前的片段一次性复制到a中排好序的位置上,比归并的效率高
do {
assert len1 > 1 && len2 > 0;
count1 = gallopRight(a[cursor2], tmp, cursor1, len1, 0, c);
if (count1 != 0) {
System.arraycopy(tmp, cursor1, a, dest, count1);
dest += count1;
cursor1 += count1;
len1 -= count1;
// run1最后一个元素一定是放在末尾,只剩一个元素就不需要归并了,如果run1空了,说明出了bug
if (len1 <= 1) // len1 == 1 || len1 == 0
break outer;
}
a[dest++] = a[cursor2++];
if (--len2 == 0)
break outer;
// 找完run1的头部序列,顺便找run2的头部序列,因为如果run1头部很多元素比run2小,在除去这些元素后,就很有可能run2的头部有很多元素比run1大。
// 又不是找尾部序列,为什么要调用gallopLeft?
count2 = gallopLeft(tmp[cursor1], a, cursor2, len2, 0, c);
if (count2 != 0) {
System.arraycopy(a, cursor2, a, dest, count2);
dest += count2;
cursor2 += count2;
len2 -= count2;
if (len2 == 0)
break outer;
}
a[dest++] = tmp[cursor1++];
if (--len1 == 1)
break outer;
// 动态调整阈值,可能是根据经验觉得会有效吧
minGallop--;
// 如果连续性还是很大,就继续调用gallop替代归并
} while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
if (minGallop < 0)
minGallop = 0;
// 跳出循环,说明连续性没那么高了,可以适当提高阈值
minGallop += 2; // Penalize for leaving gallop mode
} // End of "outer" loop
this.minGallop = minGallop < 1 ? 1 : minGallop; // Write back to field

// 跳出循环只有两种情况,要么run1只剩一个元素,要么run2空了
if (len1 == 1) {
assert len2 > 0;
System.arraycopy(a, cursor2, a, dest, len2);
a[dest + len2] = tmp[cursor1]; // Last elt of run 1 to end of merge
} else if (len1 == 0) {
// run1最后一个元素一定是放在末尾,如果run1空了,说明出了bug
throw new IllegalArgumentException(
"Comparison method violates its general contract!");
} else {
assert len2 == 0;
assert len1 > 1;
System.arraycopy(tmp, cursor1, a, dest, len1);
}
}

/**
* Like mergeLo, except that this method should be called only if
* len1 >= len2; mergeLo should be called if len1 <= len2. (Either method
* may be called if len1 == len2.)
*
* @param base1 index of first element in first run to be merged
* @param len1 length of first run to be merged (must be > 0)
* @param base2 index of first element in second run to be merged
* (must be aBase + aLen)
* @param len2 length of second run to be merged (must be > 0)
*/
private void mergeHi(int base1, int len1, int base2, int len2) {
assert len1 > 0 && len2 > 0 && base1 + len1 == base2;

// Copy second run into temp array
T[] a = this.a; // For performance
T[] tmp = ensureCapacity(len2);
int tmpBase = this.tmpBase;
System.arraycopy(a, base2, tmp, tmpBase, len2);

int cursor1 = base1 + len1 - 1; // Indexes into a
int cursor2 = tmpBase + len2 - 1; // Indexes into tmp array
int dest = base2 + len2 - 1; // Indexes into a

// Move last element of first run and deal with degenerate cases
a[dest--] = a[cursor1--];
if (--len1 == 0) {
System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
return;
}
if (len2 == 1) {
dest -= len1;
cursor1 -= len1;
System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
a[dest] = tmp[cursor2];
return;
}

Comparator<? super T> c = this.c; // Use local variable for performance
int minGallop = this.minGallop; // " " " " "
outer:
while (true) {
int count1 = 0; // Number of times in a row that first run won
int count2 = 0; // Number of times in a row that second run won

/*
* Do the straightforward thing until (if ever) one run
* appears to win consistently.
*/
do {
assert len1 > 0 && len2 > 1;
if (c.compare(tmp[cursor2], a[cursor1]) < 0) {
a[dest--] = a[cursor1--];
count1++;
count2 = 0;
if (--len1 == 0)
break outer;
} else {
a[dest--] = tmp[cursor2--];
count2++;
count1 = 0;
if (--len2 == 1)
break outer;
}
} while ((count1 | count2) < minGallop);

/*
* One run is winning so consistently that galloping may be a
* huge win. So try that, and continue galloping until (if ever)
* neither run appears to be winning consistently anymore.
*/
do {
assert len1 > 0 && len2 > 1;
count1 = len1 - gallopRight(tmp[cursor2], a, base1, len1, len1 - 1, c);
if (count1 != 0) {
dest -= count1;
cursor1 -= count1;
len1 -= count1;
System.arraycopy(a, cursor1 + 1, a, dest + 1, count1);
if (len1 == 0)
break outer;
}
a[dest--] = tmp[cursor2--];
if (--len2 == 1)
break outer;

count2 = len2 - gallopLeft(a[cursor1], tmp, tmpBase, len2, len2 - 1, c);
if (count2 != 0) {
dest -= count2;
cursor2 -= count2;
len2 -= count2;
System.arraycopy(tmp, cursor2 + 1, a, dest + 1, count2);
if (len2 <= 1) // len2 == 1 || len2 == 0
break outer;
}
a[dest--] = a[cursor1--];
if (--len1 == 0)
break outer;
minGallop--;
} while (count1 >= MIN_GALLOP | count2 >= MIN_GALLOP);
if (minGallop < 0)
minGallop = 0;
minGallop += 2; // Penalize for leaving gallop mode
} // End of "outer" loop
this.minGallop = minGallop < 1 ? 1 : minGallop; // Write back to field

if (len2 == 1) {
assert len1 > 0;
dest -= len1;
cursor1 -= len1;
System.arraycopy(a, cursor1 + 1, a, dest + 1, len1);
a[dest] = tmp[cursor2]; // Move first elt of run2 to front of merge
} else if (len2 == 0) {
throw new IllegalArgumentException(
"Comparison method violates its general contract!");
} else {
assert len1 == 0;
assert len2 > 0;
System.arraycopy(tmp, tmpBase, a, dest - (len2 - 1), len2);
}
}

/**
* Ensures that the external array tmp has at least the specified
* number of elements, increasing its size if necessary. The size
* increases exponentially to ensure amortized linear time complexity.
* 保证临时数组tmp能够容纳下minCapacity个元素,不足时扩容
*
* @param minCapacity the minimum required capacity of the tmp array
* @return tmp, whether or not it grew
*/
private T[] ensureCapacity(int minCapacity) {
// 只有长度不够才需要扩容,足够时tmp不需要修改
if (tmpLen < minCapacity) {
// Compute smallest power of 2 > minCapacity
// 计算最小的大于minCapacity的2的幂,花里胡哨的位运算,相当于把minCapacity最高位的1后面全置为1,最后newSize++,让最高位的1左移一位,后面全变成0,就得到了2的幂。因为java的int是32位,所以到>> 16就截止了。
int newSize = minCapacity;
newSize |= newSize >> 1;
newSize |= newSize >> 2;
newSize |= newSize >> 4;
newSize |= newSize >> 8;
newSize |= newSize >> 16;
newSize++;
// 小于0是溢出了?
if (newSize < 0) // Not bloody likely!
newSize = minCapacity;
else
// minCapacity是归并的两个子数组里较短的子数组的长度,所以不可能超过a.length/2
newSize = Math.min(newSize, a.length >>> 1);

// 根据扩容后的长度重新创建临时数组
@SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
T[] newArray = (T[])java.lang.reflect.Array.newInstance
(a.getClass().getComponentType(), newSize);
tmp = newArray;
tmpLen = newSize;
tmpBase = 0;
}
// 为什么tmpLen和tmpBase可以直接修改,tmp要作为返回值返回,因为tmp是数组???
return tmp;
}
}