I have talked about efficiency in the last several articles, so this time I will pick out some important points. Their common feature is reducing unnecessary duplication. For example, in week 4 (recursion in nested lists and turtle), I mentioned a method from our reading to improve the efficiency of recursion: use a variable (usually a dictionary) to record the result of every call, so that we avoid repeating the same work.
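That idea can be sketched with the classic Fibonacci example (my own illustration, not the exact code from the reading): a dictionary records each computed value, so the recursion never redoes work it has already finished.

```python
def fib(n, memo=None):
    """Naive recursion recomputes fib(n-2), fib(n-3), ... many times;
    the memo dictionary records each result so it is computed only once."""
    if memo is None:
        memo = {}
    if n in memo:
        return memo[n]          # reuse the recorded result
    if n < 2:
        return n
    memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]

print(fib(30))  # → 832040
```

Without the dictionary, fib(30) makes over a million calls; with it, only 31 distinct values are ever computed.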
Moreover, sorting algorithms are a great case study for efficiency. For instance, the principal improvement of Selection Sort over Bubble Sort is that it makes only one exchange per pass. In the worst case, both of them make (n - x) comparisons in the x-th pass for a list of n items. However, Bubble Sort may also make an exchange at every comparison, whereas Selection Sort just remembers the position of the maximum (or minimum) and makes a single exchange at the end of the pass, which can save almost half the time compared with Bubble Sort. The drawback of Selection Sort is that it still checks every item even in the average and best cases, while Bubble Sort need not (i.e. Selection Sort's time complexity is the same in the best case as in the worst case). Insertion Sort then solves this problem by combining the advantages of both. Like Bubble Sort, Insertion Sort makes no effort to find the maximum or minimum but simply looks for the "proper" position of the current item in every pass; at the same time, like Selection Sort, it does not exchange at every step but shifts items along and places the current item only once its proper position is found.
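To make the "one exchange per pass" point concrete, here is a minimal sketch of Selection Sort (variable names are my own): the inner loop only remembers where the maximum is, and the single swap happens at the end of each pass.

```python
def selectionSort(alist):
    # Fill positions from the end of the list toward the front.
    for fillslot in range(len(alist) - 1, 0, -1):
        pos_of_max = 0
        for i in range(1, fillslot + 1):
            if alist[i] > alist[pos_of_max]:
                pos_of_max = i      # only remember the position, don't swap yet
        # Exactly one exchange per pass, no matter how unsorted the list is.
        alist[fillslot], alist[pos_of_max] = alist[pos_of_max], alist[fillslot]

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
selectionSort(alist)
print(alist)  # → [17, 20, 26, 31, 44, 54, 55, 77, 93]
```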
Nevertheless, the efficiency of Insertion Sort still can't satisfy the demands of our computer scientists. (lol!!!) Grouping is a good way to increase efficiency in real life, and applying grouping in computer science gives us more amazing algorithms such as Shell Sort, Quick Sort, Merge Sort and Timsort. Shell Sort divides the list into sublists using a given "gap" and sorts each sublist; after that, most of the list is already in order, so the final plain Insertion Sort pass is cheap. However, the time complexity of Shell Sort depends on the gap sequence we choose.
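Here is a sketch of Shell Sort, assuming the simple gap sequence n//2, n//4, ..., 1 (other gap sequences exist and change the complexity): each gap round runs an insertion sort over the interleaved sublists, and the final round with gap 1 is an ordinary Insertion Sort on an almost-sorted list.

```python
def shellSort(alist):
    gap = len(alist) // 2
    while gap > 0:
        for start in range(gap):
            # Insertion sort on the sublist alist[start], alist[start+gap], ...
            for i in range(start + gap, len(alist), gap):
                value = alist[i]
                pos = i
                while pos >= gap and alist[pos - gap] > value:
                    alist[pos] = alist[pos - gap]   # shift, don't swap
                    pos -= gap
                alist[pos] = value
        gap //= 2   # halve the gap; the last round (gap == 1) is plain Insertion Sort

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shellSort(alist)
print(alist)  # → [17, 20, 26, 31, 44, 54, 55, 77, 93]
```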
Quick Sort works in a similar spirit to a Binary Search Tree: choose a pivot value, then divide the list into the part with values less than the pivot and the part with values greater than the pivot, and sort each part recursively. Merge Sort, which is my favourite algorithm, needs only nlogn time in every case. We know it takes logn steps to divide a list in half repeatedly (i.e. n / 2 / 2 / 2 ... = 1, or equivalently 2 * 2 * 2 ... = n, and the number of 2's in that product is logn). In the merge process, each item is compared at most once per level, so each level costs n operations in total. Together, that gives nlogn time complexity for Merge Sort. The defect of Merge Sort is that even in the best case the time complexity is still nlogn, which is larger than the n of Bubble Sort and Insertion Sort, since both of them need only (n - 1) comparisons on an already sorted list. Timsort manages to remedy this by combining Merge Sort with Insertion Sort, as I discussed in the last blog post.
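A sketch of Merge Sort matching that analysis: the recursion halves the list logn times, and each merge level does about n comparisons. Note that the slices and the merged list take extra space, which matters below.

```python
def mergeSort(alist):
    if len(alist) <= 1:
        return alist                      # base case: already sorted
    mid = len(alist) // 2
    left = mergeSort(alist[:mid])         # logn levels of halving
    right = mergeSort(alist[mid:])
    # Merge step: each item is compared/copied once, so one level costs n.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])               # one side is exhausted; copy the rest
    merged.extend(right[j:])
    return merged

print(mergeSort([54, 26, 93, 17, 77, 31, 44, 55, 20]))
# → [17, 20, 26, 31, 44, 54, 55, 77, 93]
```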
In addition, it seems that improving an algorithm's time complexity can sometimes increase its space complexity, as with Merge Sort, Quick Sort and Timsort. I found another algorithm, Heapsort, which balances the two: it runs in nlogn time in every case while using only constant extra space.
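A minimal in-place Heapsort sketch (my own code, not from the course): it builds a max-heap inside the list itself and repeatedly moves the maximum to the end, so it never allocates a second list the way Merge Sort does.

```python
def heapSort(alist):
    def sift_down(start, end):
        # Push alist[start] down until the max-heap property holds.
        root = start
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and alist[child] < alist[child + 1]:
                child += 1                      # pick the larger child
            if alist[root] < alist[child]:
                alist[root], alist[child] = alist[child], alist[root]
                root = child
            else:
                return

    n = len(alist)
    # Build a max-heap in place: O(n).
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n - 1)
    # Repeatedly swap the maximum to the end and shrink the heap: O(n log n).
    for end in range(n - 1, 0, -1):
        alist[0], alist[end] = alist[end], alist[0]
        sift_down(0, end - 1)

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
heapSort(alist)
print(alist)  # → [17, 20, 26, 31, 44, 54, 55, 77, 93]
```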
About Asymptotic Analysis
As I mentioned in the last post, this topic overlaps a little with the content of CSC165. The difference is that while CSC165 focuses on how to find a formula expressing the runtime of a function and its corresponding asymptotic analysis (i.e. Big-oh, Big-Omega, Big-Theta), CSC148 pays more attention to explaining the time complexity of given sorting algorithms. Nevertheless, they are closely related to each other. In the last paragraph, I analyzed how to get the nlogn time complexity of Merge Sort from the CSC148 point of view, but we still can't claim O(nlogn) without a detailed proof. To be more rigorous, I will give a brief proof for one sorting algorithm. Let's take Bubble Sort as an example:
Assume the runtime of this function is T(n), where n is the number of items in the list. In the function bubbleSort we can see two "for" loops, and the inner loop contains an "if" statement. As we are talking about Big-oh, we assume the worst case, so the body of the "if" statement executes on every iteration. Each inner iteration then performs 4 operations (one comparison and three assignments), giving 4 * passnum operations per outer iteration. Since passnum takes the values (n - 1, n - 2, ..., 2, 1), we get T(n) = 4(n - 1 + n - 2 + ... + 2 + 1) = 4 * (n - 1)n / 2 = 2n^2 - 2n. Since (2n^2 - 2n) <= 2n^2 for n >= 0, we get T(n) in O(n^2) in the worst case.

```python
def bubbleSort(alist):
    for passnum in range(len(alist) - 1, 0, -1):
        for i in range(passnum):
            if alist[i] > alist[i + 1]:
                temp = alist[i]
                alist[i] = alist[i + 1]
                alist[i + 1] = temp

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
bubbleSort(alist)
print(alist)
```
Challenge
To some extent, sorting algorithms are a complex subject in computer science. There are many different algorithms, and each has its own features, advantages and shortcomings. It is difficult to memorize the code and principles of so many algorithms, especially those involving recursion, nested loops and helper functions. Furthermore, the computation and proofs in asymptotic analysis are a real challenge because they require a lot of mathematical knowledge. In addition, I read that "according to information theory, no comparison sort can perform better than Ω(nlogn) comparisons in the worst case." I am still wondering about the reason...
Achievement
I think the topic of algorithms will accompany me throughout my university study, as I will take many mathematics courses. In fact, I like analyzing things logically, so it is really interesting to dig into the deeper theorems and applications here. To my disappointment, CSC148 seems to discuss them only a little, as they were not included in any assignments or exercises. However, the last question of Test 2 is a nice example we could spend more time analyzing (one hour is not enough for a detailed analysis...). Anyway, they cannot stop my forward steps, as they couldn't in the past, either! Come on! Final!