Wednesday, 2 April 2014

Week 12 SLOG

  So, this is the final week of lectures for CSC148. Time sure passes really quickly.

  From what a few of my friends have told me about upper-year compsci courses, it seems that this semester will probably be the last time I do an assignment in Python. Since this will be the final entry of my course blog, I think it would be fitting to share my thoughts about the Python programming language.

  My first impressions of Python were not exactly favourable. I remember that when I first used the Python IDE, I remarked that it ran at the pace of a tortoise compared to the programming language I was used to - C++. Over time, however, I began to really appreciate the simplicity and the intuitive features of Python - no #include <iostream> and #include <string> here - which lets the coder concentrate on the code itself instead of the technicalities. And those technicalities, I would like to warn anyone who has no prior knowledge of C-based languages and is planning on doing 2nd year csc courses, can be a huge pain.

  Of course, that is not all - since the standard Python interpreter is itself written in C, it wraps C's raw power in a plethora of safety features that prevent, say, accidental memory corruption due to faulty code (C is particularly infamous for that). This aspect of Python really makes life so much simpler.

  However, when all is said and done, Python is still not a programming language that is strong enough to be used for serious, performance-critical coding. That said, Python is a very good scripting language that can be used in conjunction with other programming languages (if the toolchain supports embedding it).

Wednesday, 26 March 2014

Week 11 SLOG

  Since Timsort is not covered in class, this blog seems like the perfect place to write a few lines about the built-in sorting algorithm in Python. Timsort is a relatively new sorting algorithm, invented by Tim Peters for the Python programming language, and is arguably one of the best sorting algorithms invented so far.

  The first thing to note about Timsort is that it is a hybrid sorting algorithm, utilizing both merge sort and insertion sort and exploiting their respective strengths to achieve a worst-case runtime of O(n log n), which is the best any comparison-based sorting algorithm can achieve without the help of customized data structures or hash tables.

  One of the central concepts of Timsort is the observation that any random list has a high probability of containing at least one already-sorted sublist, called a 'run', which can be either non-descending or strictly descending. And then there is 'minrun', the minimum run length required by the implementation of Timsort.

  First, the algorithm scans for a run. If the run is shorter than minrun, items are taken from the rest of the list and inserted into their correct positions in the run using insertion sort - which is best for short lists - until the run reaches minrun length. This process is repeated until the whole list has been divided into sorted runs. When all of that is done, the runs are combined using the merge step of merge sort until only one sorted run remains.

  Of course, the use of merge sort and insertion sort under their optimal conditions does not by itself guarantee the efficiency of the Timsort algorithm. Much of its speed comes from the choice of minrun and the careful management of the pending runs during merging.
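
  A heavily simplified sketch of the idea might look like the code below. To be clear, this is my own toy version, not CPython's actual implementation: it sorts fixed-size chunks with insertion sort instead of detecting natural runs, and real Timsort also reverses descending runs and uses galloping merges.

```python
MINRUN = 32  # real Timsort computes minrun from the list length

def insertion_sort(a, left, right):
    """Sort a[left:right+1] in place - fast on short (mini-)runs."""
    for i in range(left + 1, right + 1):
        key = a[i]
        j = i - 1
        while j >= left and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def merge(a, left, mid, right):
    """Merge the sorted slices a[left:mid+1] and a[mid+1:right+1]."""
    merged = []
    i, j = left, mid + 1
    while i <= mid and j <= right:
        if a[i] <= a[j]:
            merged.append(a[i]); i += 1
        else:
            merged.append(a[j]); j += 1
    merged.extend(a[i:mid + 1])
    merged.extend(a[j:right + 1])
    a[left:right + 1] = merged

def tim_sort(a):
    n = len(a)
    # Step 1: make every MINRUN-sized chunk a sorted run.
    for start in range(0, n, MINRUN):
        insertion_sort(a, start, min(start + MINRUN - 1, n - 1))
    # Step 2: merge runs pairwise, doubling the run size each pass.
    size = MINRUN
    while size < n:
        for left in range(0, n, 2 * size):
            mid = min(left + size - 1, n - 1)
            right = min(left + 2 * size - 1, n - 1)
            if mid < right:
                merge(a, left, mid, right)
        size *= 2
    return a
```

  Even this stripped-down version shows the hybrid structure: insertion sort where lists are short, merge sort where they are long.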

Wednesday, 19 March 2014

Week 10 SLOG

  Regular expressions (regexes) are found everywhere in computer interfaces, even if they are often obscured from the typical user. Our CSC148H1 assignment 2 is all about working with them, and since then I have developed a deep respect for the people involved in coding regex engines, because while the programming seems easy at first glance, it is actually very complicated. There are always cases you have not considered, and even when you think you have them all, you might still get one wrong. For example, the star '*' case might seem very easy to code at first, but it actually involves creating slices of the original string to be compared with the remaining regex.
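
  The a2 regex format is its own thing, but the flavour of the '*' difficulty can be sketched with a classic minimal matcher (my own made-up helper, supporting only literal characters, '.', and a postfix '*'):

```python
def matches(regex, s):
    """Return True iff regex matches all of s.
    Supports literals, '.' (any char) and 'x*' (zero or more x)."""
    if not regex:
        return not s  # empty pattern matches only the empty string
    if len(regex) > 1 and regex[1] == '*':
        # '*' forces a choice: match zero characters and move past
        # 'x*', or consume one matching character and try 'x*' again
        # on the remaining slice of the string.
        if matches(regex[2:], s):
            return True
        return bool(s) and regex[0] in ('.', s[0]) and matches(regex, s[1:])
    # plain character (or '.'): must match the first char of s
    return bool(s) and regex[0] in ('.', s[0]) and matches(regex[1:], s[1:])
```

  Notice how the star case alone needs backtracking over slices of the string - exactly the part that looks trivial until you try to write it.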

  When I mentioned that regexes are sometimes obscured, I'm talking about how search engines such as Google Search parse strings of the form 'r1 r2 r3' as r1.r2.r3. When you think about it, there is a lot going on underneath the shiny interfaces that we encounter in our daily virtual lives.

Thursday, 13 March 2014

Week 9 SLOG

  This week's class was primarily about the concept, implementation, and performance advantage of the binary search tree data structure when it comes to searching. But while the O(log n) runtime on a balanced tree is impressive (an unbalanced tree can degrade to O(n)), there is actually another search method that eclipses the search power of the binary search tree - searching by means of a hash table.

  First, a hash table is an array-like data structure that stores key-value pairs (the dict type in Python is built on one). The nature of the value varies depending on the usage of the hash table. In the case of storing raw data, the value might be the memory address of the file/data; for, say, storing a list of numbers, the values *can* be the numbers themselves. So how are entries found? A hash function is used to convert each key into an index into the underlying array. When a user searches for a particular key, the table only has to run the hash function on it to get an index, which can then be used to fetch the value directly from the array.

  With hash tables, the average-case runtime is O(1) - constant time (the worst case, when many keys hash to the same spot, is slower). Of course, the implementation is not easy, and I would not be surprised if I only get to learn it properly in my third year.
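
  To make the idea concrete, here is a toy chaining-based sketch (the class and its method names are my own invention; Python's real dict uses a different scheme called open addressing):

```python
class HashTable:
    """Toy hash table using separate chaining: each array slot
    ('bucket') holds a small list of (key, value) pairs."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _index(self, key):
        # The hash function turns a key into an array index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:           # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        # One hash computation jumps straight to the right bucket.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)
```

  The O(1) average case comes from each bucket staying short; the slow worst case is when every key lands in the same bucket.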

Wednesday, 5 March 2014

Week 8 SLOG

During the reading week, I stumbled upon metaclasses while looking for some ideas/inspiration for the second assignment. What began as a misunderstanding (as always) of the requirements of the first part of a2 (make public attributes read-only) became a really (valuable?) learning experience when I found out that not only are all classes in Python objects, they are instances of metaclasses, the most common being the metaclass type. type(), for the uninitiated, actually has two functions.

The first, of course, is to return the type of an object (note that since Python is dynamically typed, 'data type' does not apply in quite the traditional sense). To do that, only one argument - the object itself - has to be passed to type().

What most Python users don't know is that type can alternately act as a class 'factory', accepting three arguments and returning a class (which is actually an object itself). The first argument is a str with the desired name of the class, the second is a tuple of the classes to inherit from, and the third is a dict of the desired class attributes. Note that class attributes are the variables that can be accessed as <class name>.<variable name>, as well as the methods defined in the class.
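
For example (all the names here - Dog, legs, greet - are made up for illustration):

```python
# A method to stuff into the class dict.
def greet(self):
    return "hello from " + type(self).__name__

# type(name, bases, dict) builds and returns a brand-new class,
# equivalent to writing a class statement by hand.
Dog = type("Dog", (object,), {"legs": 4, "greet": greet})

buddy = Dog()
# buddy.legs == 4, and buddy.greet() == "hello from Dog"
```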

So what exactly do metaclasses do? Well, called outside of a class definition, a metaclass just returns a class built from the arguments passed. The interesting part is when a metaclass argument is passed to a class definition. To do so, Python users can write either a function or a class (the class route is very complicated) that returns a class, and then add the argument metaclass=<desired metaclass> to the class definition. When the class statement is evaluated, the metaclass is called in place of the default metaclass, type, to construct the class object, thus modifying the class before any instances of it can be created.
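
A toy example of the class route (ReadOnlyMeta and the attribute name are my own inventions, not the a2 solution): by overriding __new__ on a subclass of type, we get to edit the class namespace before the class object even exists.

```python
class ReadOnlyMeta(type):
    """Metaclass sketch: intercepts class creation."""
    def __new__(mcls, name, bases, namespace):
        # Tweak the class dict before the class comes into existence.
        namespace.setdefault("created_by_metaclass", True)
        return super().__new__(mcls, name, bases, namespace)

class Point(metaclass=ReadOnlyMeta):
    pass

# Point.created_by_metaclass == True, and type(Point) is ReadOnlyMeta
```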

Of course, that is not all Python users can do with metaclasses. Should the metaclass be defined as a class itself, there are many other things that can be done. But as I said, doing that is extremely complicated (no kidding!) and I'm not confident enough to write about it here yet.

Wednesday, 26 February 2014

Week 7 SLOG

  This week, I am supposed to talk about recursion. Since I have already done an entry on the basics of that particular coding technique, this week I shall discuss a question that is frequently asked in computer science - are there problems that can only be solved recursively?

  Well, it turns out that the answer is yes and no.

  No, in practice: a programmer can make use of a last-in-first-out (LIFO) stack that keeps a running list of the remaining tasks, handling the whole recursive process in a while loop that continues to push and pop tasks until the stack is empty. This method is the most common of the workarounds, and it rests on the observation that recursion in programming is really just a special case of iteration.

  On the other hand, in mathematical terms, yes: certain problems are defined recursively, so that each step is built from the previous one. A simple example is the recurrence behind the Newton-Raphson method (learned in MAT137). Another famous example is the Ackermann function. Of course, as mentioned earlier, in coding one can always manage a stack to emulate the recursive calls on the function itself.
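
  The stack-based workaround described above can be sketched like this (a made-up example: summing an arbitrarily nested Python list of numbers, which would normally be written recursively):

```python
def sum_nested(nested):
    """Sum all numbers in a nested list iteratively, replacing the
    recursive call stack with an explicit LIFO stack."""
    total = 0
    stack = [nested]          # tasks still to be processed
    while stack:
        item = stack.pop()    # pop the most recent task (LIFO)
        if isinstance(item, list):
            stack.extend(item)  # push sub-tasks instead of recursing
        else:
            total += item
    return total
```

  Every recursive call in the usual version becomes a push here, and every return becomes a pop - which is exactly what the interpreter's call stack was doing for us all along.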

Thursday, 13 February 2014

Week 6 SLOG

  This week we finally learned about trees. So why do we need to learn trees and other recursively defined data structures? Well, there are actually quite a few reasons why people went through all this trouble to come up with the idea. Here I've compiled a short list of the benefits of using trees and other recursive data structures:


  1. Simpler and more readable code.
  2. No need to manually preserve state on each iteration - the structure (and the call stack) keeps track for you.
  3. Often a more natural, memory-efficient fit for hierarchical data.
  4. Easier to search for items, since tree traversals can be easily coded.
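
  As a minimal illustration of points 1 and 4 (the class and names are my own sketch, not the exact lecture code): a general tree and a traversal take only a few lines when both are recursive.

```python
class Tree:
    """A general tree: a value plus a list of subtrees."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def preorder(t):
    """Recursive pre-order traversal: visit the root, then each
    subtree in order - the shape of the code mirrors the data."""
    result = [t.value]
    for child in t.children:
        result.extend(preorder(child))
    return result
```

  Searching for an item is then just a matter of scanning the traversal, or stopping it early once the item is found.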

Thursday, 6 February 2014

Week 5 SLOG

    This week I am going to talk a little bit about inheritance and composition in the context of object oriented programming (OOP).

  Inheritance, in the context of OOP, refers to the method of basing a class (called a subclass) on another class (called a superclass). Python, in particular, allows the developer to inherit from more than one superclass, with a fairly convoluted resolution hierarchy. Composition, on the other hand, refers to the method of embedding the class to be built upon as an instance attribute of the new class. These two approaches have very distinct syntax across programming languages and are based on two very different programming philosophies.

  So, the big question is - inheritance or composition? Well, the answer is, it depends. The first thing to consider is the programming language in question. Some programming languages offer one and some the other, while others offer clearer syntax for one than the other. But when it comes to high level, widely used programming languages like all the C based languages and a few others, it helps to have a simple guideline to follow when a decision has to be made.

  Here are a few general strengths and weaknesses of inheritance and composition that can be helpful in making the choice:

  • It is easier to change the interface of a back-end class (composition) than a superclass (inheritance).
  • It is easier to change the interface of a front-end class (composition) than a subclass (inheritance).
  • Composition allows you to delay the creation of back-end objects until (and unless) they are needed, as well as changing the back-end objects dynamically throughout the lifetime of the front-end object.
  • It is easier to add new subclasses (inheritance) than it is to add new front-end classes (composition).
  • The delegation approach of composition will often have a performance cost compared to inheritance's single invocation.
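
  A minimal side-by-side sketch (Logger and the handler classes are made-up names) shows the two philosophies. Both handlers end up with a log method, but they get it in different ways:

```python
class Logger:
    def log(self, msg):
        return "LOG: " + msg

class InheritingHandler(Logger):
    """Inheritance: an InheritingHandler *is a* Logger, so log()
    is invoked directly on the inherited method."""
    pass

class ComposingHandler:
    """Composition: a ComposingHandler *has a* Logger (a back-end
    object) and delegates to it - that extra call is the
    performance cost of delegation."""
    def __init__(self):
        self._logger = Logger()  # could be swapped out at runtime

    def log(self, msg):
        return self._logger.log(msg)
```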



Thursday, 30 January 2014

Week 4 SLOG

  This week I learned a great deal more about recursive functions, and I finally finished my first assignment, which admittedly drove me up the wall when it came to the part where I had to code not 1, not 2, but 3 recursive functions. This amount of practice, while taxing both mentally and physically (I couldn't sleep well because I couldn't stop thinking about it), led me to the realization that some problems are just impractical or plain impossible to code without using recursion. Prior to this week's class and my work on the assignment, my opinion of recursion had not been favourable. I had always thought that recursive functions are fat, ugly functions that are not worth the effort, but now I know better. Recursion is simple and elegant, and has the uncanny ability to reduce the length of code significantly. This is not to say that the reduction process is easy, though - it is not. So here, like last week, I would like to list a few short points for my future reference:

  • Recursion should be used when the problem cannot be solved, or is impractical to solve, using loops alone.
  • The trick to coding a recursive function is to figure out the base case(s) - there is often more than one.
  • If it is too difficult to figure out the base case(s), consider writing a more convoluted version of the function first, and then reducing it.
  • Thinking inductively generally helps.
  • Always make sure that the recursion will eventually end. 
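
  A short sketch of these points (depth is a made-up example, not one of the a1 functions): base cases first, then the inductive step.

```python
def depth(nested):
    """Return the nesting depth of a possibly nested list."""
    if not isinstance(nested, list):
        return 0          # base case 1: a plain item has depth 0
    if not nested:
        return 1          # base case 2: an empty list has depth 1
    # inductive step: one level deeper than the deepest element;
    # each recursive call shrinks the input, so the recursion ends
    return 1 + max(depth(item) for item in nested)
```

  Note that there really are two base cases here, just as the second point above warns.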

Monday, 20 January 2014

Week 3 SLOG



For those who are reading this post, I have a confession to make. I admit that despite having considerable prior self-taught knowledge in programming (C++) before being offered a place at the University of Toronto, I was not a fan of object-oriented programming (OOP) until last semester, when I learnt about it for the second time in CSC108H1. Maybe it was experience, maybe it was the lack of necessity, but every single program I wrote before U of T, I wrote with only loops and functions - no classes, no objects. Now that I have been, shall I say, reintroduced to OOP, my previous programs are starting to look very disorganised and littered with redundant variables and functions.

Here, I should outline two of the main benefits of using OOP, as a reminder to myself and for those of you who are still in doubt about the strengths of OOP:

  1. Increased Modularity - modularity has become the norm in most programming languages, following the increasing mainstream adoption of programming itself. One might argue that too much modularity leads to a general reduction in competence amongst programmers, since the growing availability of hefty libraries online oversimplifies the process of coding. I personally think the benefits still provide more than enough justification for heavy use of the modular approach.
  2. Simplified Variable Creation (and Naming) - creating objects out of classes enables us to associate variables, or attributes, with particular objects. The same name can be assigned to different instance variables by attaching it to each object. So, for example, a variable name like 'player_score' can be assigned to two players, each an instance of a class called Player, as player1.player_score and player2.player_score. This allows for a much greater degree of clarity for the coder.
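
The player example from point 2, sketched out (the class and attribute names are just for illustration):

```python
class Player:
    """Each instance carries its own name and player_score."""
    def __init__(self, name):
        self.name = name
        self.player_score = 0

player1 = Player("Alice")
player2 = Player("Bob")
player1.player_score = 10
# player1.player_score == 10, while player2.player_score is still 0
```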