Multi Threaded File IO

Thu Jan 31 2019

Background

Suppose that you have a large quantity of files that you want your program to ingest. Is it faster to sequentially read each file or, read the files in parallel using multiple threads? Browsing this question online will give varied answers from confused people on discussion threads. Reading resent research papers on multi threaded IO did not quite provide a clear answer my question.

A ton of people argue that multiple threads don’t increase file throughput because at the end of the day a HHD can only read one file at a time. Adding more CPU cores into the mix would actually slow down the file ingest because the HHD would have to take turns between reading fragments of different files. The seek speed of the HHD would heavily degrade the performance.



Sorting Algorithms

Sat Dec 15 2018

Insertion Sort

Selection sort although not very efficient is often used when the arrays you are sorting are very small. Another benefit of insertion sort is that it is very easy to program. Essentially, this algorithm has a sorted section which slowly grows as it pull in new elements to their sorted position.

Functional Notation

\[ s([]) = []\\ s(x::xs) = i(x, s(xs)) \]



Knapsack Problem

Tue Dec 11 2018

In this post I will go over how you can frame a question in terms of recursion and then slowly massage the question into a dynamic programming question.

Problem Description

The knapsack problem is a famous optimization problem in computer science. Given a set of items, a thief has to determine which items he should steal to maximize his total profits. The thief knows the maximum capacity of his knapsack and the weights and values of all the items.



CS Theory Exam 2 Review

Tue Nov 06 2018

This a very high level review post that I am making for myself and other people reviewing CS Theory. If you want to lean more about content in this blog post I recommend cracking open a text book– I know, gross. This hastily thrown together post will cover how to solve typical problems relating to topics covered by my second CS Theory exam.

Myhill-Nerode Theorem

Definition

L is regular if and only if it has a finite index. The index is the maximum number of elements that are pairwise distinguishable. Two strings are said to be pairwise distinguishable if you can append something to both of the strings and it makes one string accepted by the language and the other string rejected. The size of an index set X equals the number of equivalence classes it has and the minimum number of states required to represent it using a DFA. Each element in the language is accepted by only one equivalence class.

Problem Approach



Everything Fibonacci

Sun Oct 07 2018

If you have ever taken a computer science class you probably know what the fibonacci sequence is and how to calculate it. For those who don’t know: Fibonacci is a sequence of numbers starting with 0,1 whose next number is the sum of the two previous numbers. After having multiple of my CS classes give lectures and homeworks on the Fibonacci sequence; I decided to write a blog post going over the 4 main ways of calculating the nth term of the Fibonacci sequence. In addition to providing the python code for calculating the nth perm of the sequence, a proof for their validity and an analysis of their time complexities both mathematically and empirically will be examined.

Slow Recursive Definition

By the definition of the Fibonacci sequence, it is the most natural to write it as a recursive definition.

def fib(n):
    if n == 0 or n == 1:
        return n
    return fib(n-1) + fib(n-2)

Time Complexity