How can we assume that basic operations on numbers take constant time?


77

Normally in algorithms we do not care about comparison, addition, or subtraction of numbers; we assume they run in time $O(1)$. For example, we assume this when we say that comparison-based sorting is $O(n \log n)$, but when numbers are too big to fit into registers, we normally represent them as arrays, so basic operations require extra calculation per element.

Is there a proof showing that comparison of two numbers (or other primitive arithmetic functions) can be done in $O(1)$? If not, why do we say that comparison-based sorting is $O(n \log n)$?


I encountered this problem when answering an SO question, where I realized that my algorithm was not $O(n)$, because sooner or later I would have to deal with big-ints; nor was it a pseudo-polynomial time algorithm; it was in $P$.

7

I couldn't find any studies of this, but Kozen says in the introduction to "The Design and Analysis of Algorithms" that the $O(1)$ model "reflects experimental observation more accurately [than the log-cost model] for data of moderate size (since multiplication really does take one unit of time)." He also gives a reference to this paper as an example of how the $O(1)$ model can be abused.

This is absolutely not a legit evaluation (not least because it's Python), but here are some numbers from running `python -mtimeit "$a * $b"` for $a \in \{10^1, 10^2, \ldots, 10^{66}\}$ and $b = 2a$. (I stopped at 66 because that's when Python syntax stops accepting integer literals and I'd have to slightly switch my evaluation code, so I didn't. :p)

Each number is the mean of 10,000,000 loops, taking the best time of 3 runs in each case. I'd do error bars or something but that'd be more effort. :p In any case, it looks pretty constant to me, even out to $10^{50}$, which is slightly surprising, since $\log_{10}(\mathtt{sys.maxint})$ is about 19; that reinforces my suspicion that maybe this evaluation is particularly bogus and I should do it in C.
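
For anyone who wants to poke at this themselves, here is roughly the same experiment as a self-contained script; the magnitudes, loop count, and use of `timeit.repeat` are my choices rather than an exact replay of the shell loop above:

```python
import timeit

# Time a * b for a = 10**k and b = 2*a across a range of magnitudes,
# taking the best of 3 repeats, each averaged over many iterations
# (the original run used 10,000,000 loops; fewer keeps this quick).
for k in range(1, 67, 5):
    a = 10 ** k
    b = 2 * a
    best = min(timeit.repeat("a * b", globals={"a": a, "b": b},
                             repeat=3, number=1_000_000))
    print(f"10^{k:2d}: {best / 1_000_000 * 1e9:7.1f} ns per multiplication")
```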


25

It just depends on the context. When dealing with the bit-level complexity of algorithms, we do not say that the addition of two $n$-bit numbers is $O(1)$; we say it is $O(n)$. Similarly for multiplication etc.
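
For instance, here is a toy sketch of grade-school binary addition; the little-endian list-of-bits representation is just my choice for illustration:

```python
def add_bits(x, y):
    """Add two n-bit numbers given as little-endian bit lists.

    The loop runs once per bit position, so the cost is O(n):
    each output bit takes a constant amount of work, but there
    are n of them.
    """
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))
    result, carry = [], 0
    for i in range(n):
        s = x[i] + y[i] + carry
        result.append(s & 1)
        carry = s >> 1
    if carry:
        result.append(carry)
    return result

# 6 + 3 = 9: [0,1,1] is 6 and [1,1] is 3 in little-endian binary
assert add_bits([0, 1, 1], [1, 1]) == [1, 0, 0, 1]
```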


7

You are right, in general we cannot assume they are $O(1)$.

Strictly speaking, if we want to sort an array of $N$ numbers using comparisons, and the largest number is $M$, then in the worst case each comparison may involve $O(\log M)$ operations at the bit level. And if our algorithm does $O(N \log N)$ comparisons, then its total complexity will be $O(N \log N \log M)$.
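
To make the $O(\log M)$ factor concrete, here is a purely illustrative comparison that walks the binary representations from the most significant bit down; when the numbers are equal, it inspects every one of the roughly $\log_2 M$ bits:

```python
def compare(a, b):
    """Compare two non-negative integers bit by bit.

    Numbers up to M have about log2(M) bits, and in the worst case
    (e.g. a == b) every bit must be inspected, so a single comparison
    costs O(log M) bit operations.
    """
    if a.bit_length() != b.bit_length():   # more bits means a larger number
        return -1 if a.bit_length() < b.bit_length() else 1
    for i in range(a.bit_length() - 1, -1, -1):
        bit_a, bit_b = (a >> i) & 1, (b >> i) & 1
        if bit_a != bit_b:
            return -1 if bit_a < bit_b else 1
    return 0

assert compare(13, 13) == 0    # every bit inspected
assert compare(12, 13) == -1   # differs only in the lowest bit
```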

However, you will notice the difference only for very large values of $M$ that cannot be stored in a single register, as you can see from Dougal's experiment.


16

To answer the question as stated: algorithmists just do it, fairly often, by using the RAM model. For sorting, in many cases, people even analyze the simpler comparison model, which I discuss a little more in the linked answer.

To answer the implicit question about why they do it: I would say that this model has pretty good predictive power for certain types of combinatorial algorithms, in which the numbers are all "small" and, on real machines, fit in registers.

To answer the implicit follow-up about numerical algorithms: no, the plain old RAM model is not the standard here. Even Gaussian elimination can require some care. Typically, for rank computations, the Schwartz-Zippel lemma enters the picture (e.g., Section 5 here). Another canonical example is the Ellipsoid Algorithm, whose analysis also requires some care.

And finally: People have thought about string sorting before, even recently.

Update: The problem with this question is that "we" and "assume" are not so precisely specified. I would say that the people who work in the RAM model are not doing numerical algorithms or complexity theory (where determining the complexity of division was a celebrated result).


4

I would say that we typically assume O(1) arithmetic operations because we're usually doing things in the context of 32-bit integers or 64-bit integers or IEEE 754 floating point numbers. O(1) is probably a pretty good approximation for that kind of arithmetic.

But in general, that's not true. In general you need an algorithm to perform addition, subtraction, multiplication, and division. Boolos, Burgess, and Jeffrey's "Computability and Logic" springs to mind as a way to understand the proof(s) of that, in terms of a couple of different formal systems, Recursive Functions and Abacus Machines, at least in my 4th edition copy.

You can look at the lambda-calculus terms for subtraction and division with Church numerals for an easy-to-see explanation of why those two operations aren't O(1). It's a bit harder to see for addition, multiplication, and exponentiation, but it's there if you consider the form of Church numerals themselves.
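
If it helps to experiment, the encoding is easy to play with in Python; this is just a sketch of the standard Church numerals, where addition visibly does work proportional to one of its arguments rather than O(1):

```python
# Church numeral n: a function that applies f to x exactly n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

# Addition applies succ to n, m times over: the amount of reduction
# work grows with the numeral m, so it is not a constant-time step.
add = lambda m, n: m(succ)(n)

def to_int(n):
    """Decode a Church numeral by counting how many times f is applied."""
    return n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
four = succ(three)
assert to_int(add(three, four)) == 7
```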


79

For people like me who study algorithms for a living, the 21st-century standard model of computation is the integer RAM. The model is intended to reflect the behavior of real computers more accurately than the Turing machine model. Real-world computers process multiple-bit integers in constant time using parallel hardware; not arbitrary integers, but (because word sizes grow steadily over time) not fixed-size integers, either.

The model depends on a single parameter $w$, called the word size. Each memory address holds a single $w$-bit integer, or word. In this model, the input size $n$ is the number of words in the input, and the running time of an algorithm is the number of operations on words. Standard arithmetic operations (addition, subtraction, multiplication, integer division, remainder, comparison) and boolean operations (bitwise and, or, xor, shift, rotate) on words require $O(1)$ time by definition.
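
As a sketch of what "operations on words" means, one can mimic a $w$-bit machine in Python by truncating every result to $w$ bits; the wrapper functions below are mine, purely for illustration:

```python
W = 64                     # the word size w: a model parameter, not a constant
MASK = (1 << W) - 1        # keep only the low w bits, like a hardware register

# Each of these counts as a single O(1) operation in the integer RAM model;
# truncating with MASK mimics w-bit wraparound arithmetic.
def add(a, b): return (a + b) & MASK
def mul(a, b): return (a * b) & MASK
def shl(a, k): return (a << k) & MASK

assert add(MASK, 1) == 0           # 2^w wraps around to 0
assert mul(1 << 32, 1 << 32) == 0  # overflow beyond w bits is discarded
```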

Formally, the word size $w$ is NOT a constant for purposes of analyzing algorithms in this model. To make the model consistent with intuition, we require $w \ge \log_2 n$, since otherwise we cannot even store the integer $n$ in a single word. Nevertheless, for most non-numerical algorithms, the running time is actually independent of $w$, because those algorithms don't care about the underlying binary representation of their input. Mergesort and heapsort both run in $O(n\log n)$ time; median-of-3-quicksort runs in $O(n^2)$ time in the worst case. One notable exception is binary radix sort, which runs in $O(nw)$ time.
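
Binary radix sort is worth seeing concretely, since it is the exception: one stable $O(n)$ pass per bit, and $w$ passes overall, gives $O(nw)$. A minimal sketch, assuming non-negative $w$-bit integer keys:

```python
def radix_sort(keys, w):
    """Sort unsigned w-bit integers with one stable pass per bit.

    Each pass is O(n) and there are w passes, so the total running
    time is O(n * w): unlike mergesort, it depends on the word size.
    """
    for bit in range(w):
        zeros = [k for k in keys if not (k >> bit) & 1]
        ones = [k for k in keys if (k >> bit) & 1]
        keys = zeros + ones          # stable: relative order preserved
    return keys

assert radix_sort([5, 3, 7, 0, 2], w=3) == [0, 2, 3, 5, 7]
```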

Setting $w = \Theta(\log n)$ gives us the traditional logarithmic-cost RAM model. But some integer RAM algorithms are designed for larger word sizes, like the linear-time integer sorting algorithm of Andersson et al., which requires $w = \Omega(\log^{2+\varepsilon} n)$.

For many algorithms that arise in practice, the word size $w$ is simply not an issue, and we can (and do) fall back on the far simpler uniform-cost RAM model. The only serious difficulty comes from nested multiplication, which can be used to build very large integers very quickly. If we could perform arithmetic on arbitrary integers in constant time, we could solve any problem in PSPACE in polynomial time.
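
The nested-multiplication danger is easy to demonstrate: with repeated squaring, each "constant-time" multiplication doubles the bit length, so polynomially many operations can create exponentially long integers.

```python
x = 2
for k in range(10):
    x = x * x          # one "O(1)" multiplication in the uniform-cost model
    print(f"after {k + 1} squaring(s): {x.bit_length()} bits")
# After k squarings x = 2^(2^k), so the bit length doubles every step:
# ten unit-cost operations already yield a 1025-bit integer.
```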

Update: I should also mention that there are exceptions to the "standard model", like Fürer's integer multiplication algorithm, which uses multitape Turing machines (or equivalently, the "bit RAM"), and most geometric algorithms, which are analyzed in a theoretically clean but idealized "real RAM" model.

Yes, this is a can of worms.