# Concepts

## Data Structures

What is a data structure? Quite simply, it's a way of structuring data. Specifically, it is a way to store and organize data so that the data can be used in a predictable, efficient manner.

You're probably familiar with an array. If you want to group 10 integers, put them in a size-10 array.
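As a quick illustration, here's what that size-10 group of integers might look like in Python. This is just a sketch: Python's built-in `array` module gives a typed, array-like container, while most languages offer fixed-size arrays directly, and the particular numbers here are arbitrary.

```python
from array import array

# A typed array ("i" = signed int) holding our 10 integers.
numbers = array("i", [3, 1, 4, 1, 5, 9, 2, 6, 5, 3])

print(len(numbers))  # the size of the array: 10
print(numbers[0])    # elements are accessed by index: 3
```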

But arrays are far from the only way of grouping and structuring data. Linked lists, graphs, and trees are all ways of grouping data, and many algorithms are designed to work on a specific data structure. Some programming languages have some of these built in and others do not, but to get a better understanding of what's going on, it really is best to take the time to code your own implementation of each of them, at least once.

More information:

- Wikipedia

## Abstract Data Type

An Abstract Data Type (ADT) is a data structure that is defined by the operations that can be performed on it. Rather than a rigidly defined data structure, you can think of an ADT as a set of criteria for the operations it supports, not criteria on how those operations are actually implemented.

An int (integer) isn't an Abstract Data Type. A stack is. Specifically, a stack is an ADT defined as a Last-In-First-Out data structure, where you pile things on top of the stack with a *push* command and remove items from the top with a *pop* command. Many data structures, such as an array or a linked list, can be used to implement a stack.
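To make that concrete, here is one possible Python sketch of a stack backed by a built-in list. The class and method names are just illustrative, and this is one of many valid implementations; the only thing that makes it a stack is that *push* and *pop* both work on the same end.

```python
class Stack:
    """A Last-In-First-Out (LIFO) stack backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Pile the new item on top of the stack.
        self._items.append(item)

    def pop(self):
        # Remove and return the item currently on top.
        return self._items.pop()

    def is_empty(self):
        return not self._items


s = Stack()
s.push("plate 1")
s.push("plate 2")
print(s.pop())  # "plate 2": the last one in is the first one out
```

A linked-list-backed version would satisfy the exact same criteria, which is the whole point of an ADT.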

More information:

- Video 1
- Wikipedia

## Algorithms

Think of an Algorithm as a set of directions to get from point A to point B. There may be a bunch of different ways to get from point A to point B, and you want to choose the best one. In the same way there may be a variety of different ways to perform a computing task, and the idea is that you want to choose the best way, the best Algorithm, to do so.

Algorithms are also, in theory, programming-language independent, in that they are designed so that they can be implemented in any programming language. Sometimes it's easier, or takes a lot less code, to implement an algorithm in one language than in another, but it's rarely the case that it CAN'T be implemented in a given language. Going back to the set-of-directions analogy, you can think of programming languages as types of cars. A set of directions that involves a lot of straight freeways with no traffic might be a great fit for a fast sports car, while a set of directions that involves a winding dirt road might be better served by an SUV. But that isn't to say you CAN'T use an SUV on the freeway, or drive your sports car on a dirt road (scratches and dirt aside).

This isn't a perfect analogy, but it's how I think about it in my head and it's worked so far.

## Algorithmic Analysis

One could say this is more of an advanced topic, maybe even an unnecessary one, but it's definitely an interesting one. As a programmer you will be concerned with whether one algorithm is faster than another, and that is determined by analyzing the algorithm. I will tell you how fast and efficient the algorithms in this book are compared to similar ones, and I will also go into detail as to why. If all you care about is whether A or B is faster, then you can probably skip over these sections and maybe come back to them later.

One thing you do need to know regardless is how the running time of an Algorithm is expressed. There are two popular ways, Big Oh notation and Tilde notation, but they both express pretty much the same idea.

Say you have an Algorithm to brute-force search for duplicates in an Array of N objects. That is to say, you manually compare every object to every other object. The running time of this algorithm will theoretically be something like:

**c(N^{2}) + z**, where

*c* is the amount of time it takes the computer to perform each calculation, and

*z* is some arbitrary initial setup time.

The first thing we can do is completely disregard *c* and *z*. Why? Because we can't control them. Both of them are completely dependent on things like the speed of the computer the program is being run on, the programming language used, how well the programmer coded up the algorithm, etc. Things that, as an Algorithm Designer, are completely out of your hands.
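Here's a sketch of that brute-force duplicate search in Python (the function name is illustrative). The nested loop is exactly where the N^{2} behavior comes from: every item gets compared against every other item.

```python
def has_duplicates(items):
    """Brute-force duplicate search: compare every item to every other.

    The nested loop performs on the order of N^2 comparisons
    for a list of N items.
    """
    n = len(items)
    for i in range(n):
        # Start at i + 1 so each pair is compared only once.
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


print(has_duplicates([7, 3, 9, 3]))  # True
print(has_duplicates([7, 3, 9, 1]))  # False
```

Starting the inner loop at `i + 1` halves the comparison count to roughly N(N-1)/2, but as the next paragraphs explain, constant factors like that 1/2 are exactly what the analysis throws away: it is still proportional to N^{2}.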

The next thing that we can say is that we only care about the highest-order term. Say you have two algorithms. Algorithm A has a running time (after stripping away the constants) of **N^{2} + N** and Algorithm B has a running time of **N^{2} + 10N**. Objectively speaking, the first one is faster. For small inputs, noticeably faster. If you're only working with 10 items, that is to say N = 10, then Algorithm A has a running time proportional to

A = 10^{2} + 10 = 110

While B is proportional to

B = 10^{2} + 100 = 200, or almost twice as long.

But we don't care about small inputs. For small inputs the operation is going to be super fast regardless of what algorithm you use. In fact, when you're only dealing with small inputs, it's usually easier to code up a 'slower' but simpler algorithm than one that runs faster but takes 4x as many lines of code. What we care about is large inputs. Let's revisit the previous algorithms, but change the input to 10,000, and now we have

A = 100,000,000 + 10,000 = 100,010,000

B = 100,000,000 + 100,000 = 100,100,000
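You can verify both sets of numbers by evaluating the two running-time polynomials, N^{2} + N and N^{2} + 10N, at each input. A quick Python check (function names are just illustrative):

```python
def running_time_a(n):
    # Algorithm A: N^2 + N
    return n ** 2 + n


def running_time_b(n):
    # Algorithm B: N^2 + 10N
    return n ** 2 + 10 * n


print(running_time_a(10), running_time_b(10))          # 110 200
print(running_time_a(10_000), running_time_b(10_000))  # 100010000 100100000
```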

Still a difference, but not nearly as much, and this difference will only decrease as the input gets larger. So we say that both algorithms have the same proportional running time of N^{2}, which is expressed as either ~N^{2} or O(N^{2}). The O() notation, pronounced Big Oh, comes from Calculus and its analysis of the growth of functions, so I won't go into all the details of it here. All you need to know is that O(N^{2}) is much faster than O(N^{4}).

In this text we will use the ~N^{2} notation, but if you look at other algorithm resources and see the O() notation, just remember it means basically the same thing for our purposes.

## Conventions

For each Data Structure presented in this text I include its:

**Idea** - The concept behind it, its pros and cons, and why you might use it.

**Implementation** - A pseudocode, carefully documented implementation of it.

**Examples** - Some language-specific examples. As I learn more languages, this area will be expanded upon.

For each Algorithm in this text I include its:

**Idea** - The idea behind the algorithm, when you would use it, and why you would use it.

**Implementation** - A pseudocode, carefully documented implementation of it.

**Analysis** - A more in-depth analysis of its runtime and efficiency.

**Improvements** - Some ways the algorithm may be improved beyond its standard implementation.

**Examples** - Some language-specific examples. As I learn more languages, this area will be expanded upon.

## Sources

Most of the information in this text comes from these textbooks:

- Introduction to Algorithms, 3rd Edition
- Algorithms, 4th Edition
- An Introduction to the Analysis of Algorithms, 2nd Edition

I have taken courses on Algorithms, both through my Alma Mater UAT and Coursera, and they all used those textbooks for reference. In fact, you can think of this text as my verbose notes from those courses, and if you find any of this interesting I highly recommend you check out those textbooks as well, as I have every intention of re-reading them periodically until I can recite them from memory.

Of course the idea is that this will hopefully be an ever-evolving text, and over time become a de facto reference for everything anyone would want to know about Algorithms and Data Structures.

Also Googling. Lots and lots...and lots...of Googling.