[MWCS] - 01 - Asymptotics
- asymptotic notation:
- \( \mathcal{O} \): Big Oh
- \( \Omega \): Omega
- asymptotic notation used as:
- large-input algorithm performance assessor
- algorithm success probability assessor
- performance assessed along lines of
- running time of algorithm (time complexity)
- analogously, memory usage by algorithm (space complexity)
- success probability of an algorithm is explored in terms of
- negligible
- noticeable
- overwhelming
case study:
need for asymptotic notation
problem statement
- consider the problem of reading integers from
- a file whose number of lines is not known
- a standard input stream
- the goal is to read these integers into numpy arrays
- numpy arrays are of fixed size once allocated, so growing one means allocating a new array and copying
- consider this situation to illustrate the need for asymptotic notation
study process
- two algorithms are studied (both sketched in the code below)
- incremental array size increase (grow by one slot per element)
- doubling the array size whenever the array fills up
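A minimal Python sketch of the two strategies, assuming the integers arrive one per line on standard input; the function names, the use of `np.empty`, and the `int64` dtype are illustrative choices, not the course's code.

```python
import sys
import numpy as np

def read_ints_grow_by_one(stream):
    """Algorithm 1: grow the array by one slot per element (copies everything each time)."""
    arr = np.empty(0, dtype=np.int64)
    for line in stream:
        bigger = np.empty(arr.size + 1, dtype=np.int64)  # allocate one extra slot
        bigger[:arr.size] = arr                           # copy all existing elements
        bigger[arr.size] = int(line)
        arr = bigger
    return arr

def read_ints_doubling(stream):
    """Algorithm 2: double the capacity whenever the array fills up."""
    arr = np.empty(1, dtype=np.int64)
    used = 0
    for line in stream:
        if used == arr.size:                              # out of space: double the capacity
            bigger = np.empty(2 * arr.size, dtype=np.int64)
            bigger[:used] = arr
            arr = bigger
        arr[used] = int(line)
        used += 1
    return arr[:used]                                     # trim unused slots

if __name__ == "__main__":
    print(read_ints_doubling(sys.stdin))
```

For \(n\) input lines, the first version copies \( 0 + 1 + \dots + (n-1) = \frac{n(n-1)}{2} \) elements in total, which is the quadratic term in \( T_1(n) \); the doubling version copies fewer than \( 2n \) elements in total, consistent with the linear bound on \( T_2(n) \).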
analytical time assessment
- for reading \( n \) lines of integers
- with the other symbols denoting time constants for other parts of each algorithm,
- time taken by each algorithm is analytically derived to be
algorithm 1 \[ T_1(n) = nd + \frac{n(n-1)}{2}c + e \]
algorithm 2 \[ T_2(n) < (4C+D)n + A \]
- for a large enough number of input lines, only the highest-order terms matter, so
- \( T_1(n) \approx 0.00013 n^2 \text{ ms }\)
- \( T_2(n) < 0.0021n \text{ ms }\)
payoff noticed
- even with a disadvantage in the constant factors, the second algorithm would outperform the first
- roughly 2 seconds (2nd algorithm) versus 22 minutes (1st algorithm)
- the first algorithm’s running time grows quadratically for large inputs
- while the second one’s grows linearly
- (a quick numerical comparison using the approximations above is sketched below)
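A back-of-the-envelope comparison using the two approximations quoted above; the input size of 100,000 lines is an assumed example value, chosen only to show the quadratic term landing in the tens-of-minutes range while the linear bound stays well under a second.

```python
# Approximate running times from the derived formulas (constants quoted above);
# n = 100_000 lines is an assumed example input size.
n = 100_000
t1_ms = 0.00013 * n**2   # quadratic term dominates T1
t2_ms = 0.0021 * n       # linear upper bound on T2

print(f"algorithm 1: ~{t1_ms / 1000 / 60:.1f} minutes")   # roughly 21.7 minutes
print(f"algorithm 2: < {t2_ms / 1000:.2f} seconds")       # under a quarter of a second
```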
Big Oh notation
- Big Oh, denoted by \( \mathcal{O} \), is a notation for describing the time complexity of an algorithm
- expresses the upper bound for an algorithm’s running time
- as the number of inputs tends to infinity
- hence the name asymptotic notation
- in algorithm #2 above, the upper bound for the runtime is given by
- \( T_2(n) < 0.0021n \text{ ms }\)
- and this holds only when \( n > 150 \)
- this time upper bound of algorithm #2 for \( n > 150 \) is expressed in Big Oh notation as
\[ T_2 \in \mathcal{O}(n) \]
- this is the same as saying “for sufficiently large \(n\), \(T_2(n) < dn \)”
- where
- \(d\) is some constant
- \(T_2(n)\) is time taken by algorithm #2 to process \(n\) lines
- Big Oh measures how the time taken by an algorithm scales as the input size grows large
formally
- assume two positive functions \(f\) and \(g\)
- \(g\) acts as an upper bound for \(f\) (here, \(f\) models the runtime)
- the notation \( f \in \mathcal{O}(g) \) (\(f\) belongs to Big Oh of \(g\)) means that there exist
- a constant \( c > 0 \)
- a threshold \( n_0 \)
- such that \( f(n) \leq c \cdot g(n) \) for all \( n \geq n_0 \)
- (this definition is checked numerically in the sketch below)
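The definition can be sanity-checked numerically: pick candidate witnesses \(c\) and \(n_0\), then verify the inequality over a range of inputs. The functions, witnesses, and checked range below are illustrative choices, not part of the notes.

```python
# Numerical check of the definition f ∈ O(g):
# find a witness constant c and threshold n0, then verify f(n) <= c*g(n) on a range of n.
def f(n):
    return 3 * n**2 + 5 * n + 7   # claim: f ∈ O(n^2)

def g(n):
    return n**2

c, n0 = 4, 13                      # candidate witnesses (found by inspection)
assert all(f(n) <= c * g(n) for n in range(n0, 100_000))
print(f"f(n) <= {c}*g(n) holds for all checked n >= {n0}")
```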
mathematical assumptions for Big Oh
- functions are positive
- i.e. functions whose values are positive real numbers
- function inputs are natural numbers
- and starting from some point \( n_0 \)
- the function (\(f(n)\)) we are interested in is either
- the run time amount (of algorithm for given input size)
- the memory consumption amount (of algorithm for given input size)
- \(g(n)\) specifies the upper limit
- ignore values of \(f\) and \(g\) for small \(n\)
- ignore constant factors in bounding function in the notation
criteria for Big Oh
criterion 1
- given positive functions \(f\) and \(g\)
- \( f \in \mathcal{O}(g) \) if and only if the ratio \( \frac{f(n)}{g(n)} \) is bounded for sufficiently large \(n\)
- i.e. there exists \(c\) and \(n_0\) such that \( \frac{f(n)}{g(n)} \leq c \) for all \( n \geq n_0\)
criterion 2
- given positive functions \(f\) and \(g\)
- if \( \lim_{n \rightarrow \infty } \frac{f(n)}{g(n)} \) exists
- then \( f \in \mathcal{O}(g) \) if and only if \( \lim_{n \rightarrow \infty } \frac{f(n)}{g(n)} < \infty \)
- in some cases, criterion 2 cannot be applied since the limit does not exist
- in such cases, fall back to criterion 1 and check directly that the ratio is bounded for sufficiently large \(n\)
- (criterion 2 is illustrated with a small symbolic-limit sketch below)
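Criterion 2 can be tried out with a symbolic limit. The sketch below uses sympy (an assumed extra dependency) and illustrative functions.

```python
# Criterion 2 via a symbolic limit.
import sympy as sp

n = sp.symbols("n", positive=True)
f = 3 * n**2 + 5 * n          # illustrative functions
g = n**2

print(sp.limit(f / g, n, sp.oo))     # 3, which is finite, so f ∈ O(g)

# A case where the limit is infinite, so f2 is NOT in O(g2):
f2, g2 = n**3, n**2
print(sp.limit(f2 / g2, n, sp.oo))   # oo
```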
Big Oh lemma
general lemma
- if \( f \in \mathcal{O}(g) \) and \( a > 0 \)
- then \( a \cdot f \in \mathcal{O}(g) \)
- if \( f_1 \in \mathcal{O}(g_1) \) and \( f_2 \in \mathcal{O}(g_2) \)
- then \( f_1 + f_2 \in \mathcal{O}(g_1 + g_2) \)
- also, \( f_1 \cdot f_2 \in \mathcal{O}(g_1 \cdot g_2) \)
- if \( f_1, f_2 \in \mathcal{O}(g) \)
- then \( f_1 + f_2 \in \mathcal{O}(g) \) (spot-checked in the sketch after this list)
- if \( f_1 \in \mathcal{O}(f_2) \) and \( f_2 \in \mathcal{O}(g) \)
- then \( f_1 \in \mathcal{O}(g) \)
- if \( g_1 \in \mathcal{O}(g_2) \)
- then \( g_1 + g_2 \in \mathcal{O}(g_2) \)
- if \( f \in \mathcal{O}(g) \) and \( \lim_{n \rightarrow \infty} b(n) = \infty \)
- then \( \sum_{i=1}^{b(n)} f(i) \in \mathcal{O}( \sum_{i=1}^{b(n)} g(i) )\)
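As a concrete instance of one of these lemmas (that \( f_1, f_2 \in \mathcal{O}(g) \) implies \( f_1 + f_2 \in \mathcal{O}(g) \)), here is a small numerical spot check; the functions and witness constants are illustrative choices, not from the notes.

```python
# Spot check: if f1, f2 ∈ O(g) then f1 + f2 ∈ O(g).
def f1(n): return 2 * n + 10        # f1 ∈ O(n) with witnesses c = 3, n0 = 10
def f2(n): return 5 * n + 1         # f2 ∈ O(n) with witnesses c = 6, n0 = 1
def g(n):  return n

# Proof idea of the lemma: add the witness constants (3 + 6 = 9) and take the larger threshold.
c, n0 = 9, 10
assert all(f1(n) + f2(n) <= c * g(n) for n in range(n0, 100_000))
print("f1 + f2 is bounded by", c, "* g(n) for n >=", n0)
```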
common functions lemma
- \( p(n) \in \mathcal{O} (n^k) \)
- if \(p\) is a polynomial of degree \(k\)
- \( n^k \in \mathcal{O}(a^n) \) for any constant real numbers \( a > 1 \) and \( k > 0 \)
- \( (\log_a n)^k \in \mathcal{O}(n^l) \) for any constant real numbers \( a > 1, k > 0 \) and \( l > 0 \)
- \( \log_a n \in \mathcal{O}(\log_b n) \) for any constant real numbers \(a > 1 \) and \( b > 1 \)
- \( a^n \in \mathcal{O}(b^n) \) for any constant real numbers \( b \geq a > 0 \)
- (two of these orderings are checked numerically in the sketch below)
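Two of these orderings (polynomials versus exponentials, and logarithms versus polynomials) can be seen numerically; the particular exponent and base below are arbitrary illustrative choices.

```python
import math

# Two of the orderings above, seen numerically:
#   n^3 ∈ O(2^n):   2^n eventually dominates n^3 (crossover near n = 10)
#   log2(n) ∈ O(n): n dominates log2(n) almost immediately
for n in (2, 10, 20, 40, 80):
    print(f"n={n:>3}: n^3={n**3:>8}  2^n={2**n:>25}  log2(n)={math.log2(n):6.2f}")
```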
Big Omega notation
- Big Oh notation expresses the time taken by an algorithm in terms of
- a “not more than” philosophy
- an upper bound for the time taken
- used to assess and compare the speed of algorithms
- Big Omega notation expresses the time taken by an algorithm in terms of
- an “at least this much” philosophy
- a lower bound for the time taken
- used in security to ensure encryption schemes take at least a certain amount of time to break
formally
- assume two positive functions \(f\) and \(g\)
- \(g\) acts as a lower bound for \(f\) (here, \(f\) models the runtime)
- \(f\) grows asymptotically no slower than \(g\)
- the notation \( f \in \Omega(g) \) (\(f\) belongs to Big Omega of \(g\)) means that there exist
- a constant \( c > 0 \)
- a threshold \( n_0 \)
- such that \( f(n) \geq c \cdot g(n) \) for all \( n \geq n_0 \)
- note the reversed inequality: Big Oh notation requires \( f(n) \leq c \cdot g(n) \) instead
- (a numerical check in the same style as for Big Oh is sketched below)
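The same style of numerical sanity check works for the lower-bound definition; the functions and witnesses below are illustrative choices, not from the notes.

```python
# Numerical check of the definition f ∈ Ω(g):
# find witnesses c > 0 and n0 with f(n) >= c * g(n) for all n >= n0.
def f(n):
    return n**2 - 10 * n      # claim: f ∈ Ω(n^2)

def g(n):
    return n**2

c, n0 = 0.5, 20                # candidate witnesses: n^2 - 10n >= 0.5*n^2 once n >= 20
assert all(f(n) >= c * g(n) for n in range(n0, 100_000))
print(f"f(n) >= {c}*g(n) holds for all checked n >= {n0}")
```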
criteria for Big Omega
- given positive functions \(f\) and \(g\)
- if \( \lim_{n \rightarrow \infty } \frac{f(n)}{g(n)} \) exists
- then \( f \in \Omega(g) \) if and only if \( \lim_{n \rightarrow \infty } \frac{f(n)}{g(n)} > 0 \)
- unlike for Big Oh notation, where
- \( f \in \mathcal{O}(g) \) if and only if \( \lim_{n \rightarrow \infty } \frac{f(n)}{g(n)} < \infty \)
Big Oh and Big Omega relationship
- they are tightly related
- \( g \in \Omega(f) \) if and only if \( f \in \mathcal{O}(g) \)
- keeping this in mind, the lemmas for Big Oh carry over to Big Omega
- with the appropriate modifications (e.g. reversing the direction of the bounds)