Wednesday, March 22, 2017

Sublime Selection

Select a character

SHIFT + LEFT / RIGHT

Select from cursor to end of word

SHIFT + OPTION + RIGHT

Select from cursor to beginning of word

SHIFT + OPTION + LEFT

Select a word

CMD + D

Select a line

CMD + L

Select between parentheses

CTRL + SHIFT + SPACE

Select everything in file

CMD + A

Sublime Cursor Motion

Here are the basic cursor motion commands you should know.

Move by character

UP / DOWN / LEFT / RIGHT

Move by word 

OPTION + LEFT / RIGHT

Move to the end of the line

CMD + RIGHT

Move to the beginning of the line

CMD + LEFT

Move to a matching brace

CTRL + M

Move to the top of the file

CMD + UP

Move to the bottom of the file

CMD + DOWN

Go to a line

CTRL + G (preferred) or CMD + T + :

Go to a symbol

CMD + R

Touch Typing Numbers and Special Keys

When I learned to touch type, I learned how to type letters properly but never properly learned numbers and special characters.

Here's how the keys map to fingers:

left hand
1 - pinky
2 - ring
3 - mid
4 - index
5 - index

right hand
6 - index
7 - index
8 - mid
9 - ring
0 - pinky

tab - left pinky
caps lock - left pinky
shift - left pinky
fn - left pinky
control - left ring finger
option / alt - left mid
command - left thumb

http://apple.stackexchange.com/questions/47293/what-fingers-do-i-need-to-use-for-hitting-control-option-and-command-buttons-on
http://www.typing-lessons.org/preliminaries_4.html

Tuesday, March 21, 2017

Database Indexes

Premature optimization is the root of all evil.

Database indexes are all about optimization. Using indexes prematurely is unnecessary in most cases. However, knowing what they are and how to use them will save your ass. 

So let's talk indexing.

When you do things to some set of data in a relational database, the db has to retrieve that data. And if that data is retrieved based on some value (get every row where column C equals 5), then the db has to go through every single row in a table containing that data and decide whether or not that row should be used. 

Unless you use indexes. 

If you use indexes, the db does NOT have to go through every single row. Instead, it will find the rows it needs by using a special data structure. That special data structure is the index. 

Most database indexes are B-Trees, which allow for logarithmic time operations. In other words, an index can change an O(n) lookup into an O(log n) one. If you have a lot of data, that can be a huge difference. Instead of looking at every row and checking whether a specific column equals a certain value, the database can use the index for that column to do an O(log n) lookup for the value in the B-Tree, then follow the pointer to the row! 
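To see the difference for yourself, here is a minimal sketch using Python's built-in sqlite3 module. The table name readings, the index name idx_readings_c, and the sample data are made up for illustration. It asks SQLite how it plans to run the same query before and after an index is created on column c.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: 10,000 rows with a column c holding values 0..99.
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, c INTEGER)")
conn.executemany(
    "INSERT INTO readings (c) VALUES (?)",
    [(i % 100,) for i in range(10000)],
)


def query_plan(connection):
    """Return SQLite's description of how it would run the lookup on c."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM readings WHERE c = 5"
    ).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " ".join(row[3] for row in rows)


plan_without_index = query_plan(conn)  # a full scan of the table
conn.execute("CREATE INDEX idx_readings_c ON readings (c)")
plan_with_index = query_plan(conn)     # a search that uses the index

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text depends on the SQLite version, but the first plan should mention a scan and the second should name the index.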

Using an index is not always an optimization.

There are times when using an index can actually hurt you. An index only helps the database locate rows, so if a query returns a large share of the table, the index just adds overhead: after the matching entries are found through the index, each row still has to be fetched from the table, and that can cost more than simply scanning the whole table once.


Monday, March 20, 2017

Relational Databases

So there's data that we want to store, access, and manipulate. Relational databases are tools that enable us to do just that.

With relational databases, you have to define the structure of the data before you can do anything with it. This structure is expressed as tables that have columns. And those columns represent a specific type of data (numerical, date, text, etc).

Tables are the types of things that you have information about. Columns are information about those types of things. Rows are information about the actual things. For example, if I'm interested in working with medical data and I need a database of patient (a type of thing) information, I might have a table called patients and the patients have information like name, and social security number, and insurance.

The blueprint of a database, which is just a set of table definitions, is called a schema.

Here's an example:

CREATE TABLE Doctor (
   name CHAR(20),
   id INTEGER PRIMARY KEY
);

CREATE TABLE Patient (
   name CHAR(20),
   social_security_number INTEGER PRIMARY KEY,
   doctor_id INTEGER REFERENCES Doctor(id)
);

Tables can also have keys which are a special type of information that is unique to every row in the table. In the case of patients, that might be their social security number since no two people will have the same social security number.

Using SQL to manipulate data in a relational DB

Now for this database to be useful, it needs to contain data. We can create a db and manipulate data in that db by using SQL (structured query language), a language the database understands that lets you describe what you want to do with the data.

> CREATE DATABASE example_hospital;

Adding a doctor

> INSERT INTO Doctor (id, name) VALUES (1, 'Dr. Phil');

Get all the doctors!

> SELECT * FROM Doctor

Get the names of all of the patients of the doctor named "Dr. Phil" (this involves data from more than one table!)
To select data from multiple related tables, we have to join them with a JOIN clause.

> SELECT Patient.name FROM Patient JOIN Doctor ON Patient.doctor_id = Doctor.id WHERE Doctor.name = 'Dr. Phil';

This is known as an inner join. It returns only the combinations of rows where the key value in one table matches the key value in the other. In other words, it excludes rows from both tables that do not link up.

Left outer - retain all the rows in the left table, even if they have no match in the right table. For those rows, the right table's columns are filled with the value NULL. With Patient on the left, this would include all the patients even if they don't have a doctor seeing them.

Right outer - retain all the rows in the right table. This would include all the doctors even if they don't have patients.

Full outer - retain rows in both tables! This would include all the patients and doctors.
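The inner and left outer joins described above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the tables follow the Doctor/Patient schema from this post, while the sample rows (Sally, Bob, Carol, Dr. Jones) are made up for illustration. Right and full outer joins are left out because older SQLite versions don't support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Doctor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Patient (
        social_security_number INTEGER PRIMARY KEY,
        name TEXT,
        doctor_id INTEGER REFERENCES Doctor(id)
    );
    INSERT INTO Doctor (id, name) VALUES (1, 'Dr. Phil'), (2, 'Dr. Jones');
    INSERT INTO Patient VALUES
        (111111111, 'Sally', 1),
        (222222222, 'Bob', 1),
        (333333333, 'Carol', NULL);  -- Carol has no doctor
""")

# Inner join: only patients that link up with a doctor.
inner_rows = conn.execute("""
    SELECT Patient.name, Doctor.name
    FROM Patient JOIN Doctor ON Patient.doctor_id = Doctor.id
    ORDER BY Patient.name
""").fetchall()

# Left outer join: every patient; missing doctor columns come back as NULL (None).
left_rows = conn.execute("""
    SELECT Patient.name, Doctor.name
    FROM Patient LEFT OUTER JOIN Doctor ON Patient.doctor_id = Doctor.id
    ORDER BY Patient.name
""").fetchall()

print(inner_rows)
print(left_rows)
```

Carol only shows up in the left outer join, paired with None where the doctor's name would be.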

Get the number of patients

> SELECT count(*) from Patient

Get the number of patients per doctor!

> SELECT Doctor.name, count(*) AS num_patients FROM Patient JOIN Doctor ON Patient.doctor_id = Doctor.id GROUP BY Doctor.id, Doctor.name;

Maintaining Data Consistency

Databases also prevent you from doing things that they consider nonsensical. For example, say you declare that a patient has a doctor and you insert a bunch of patients that reference doctors. You can't then just delete those doctors, because there would be foreign keys in the patients table that point to nothing. That violates referential integrity (which says that every foreign key in a row must point to an actual row in the referenced table).

DB's also support features like transactions, where you can specify a series of operations that are treated as atomic. The changes only persist if every operation succeeds. Otherwise, no changes persist.
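Both ideas can be sketched with Python's built-in sqlite3 module. This is only an illustration under a few assumptions: the sample rows are made up, and SQLite needs foreign key checks switched on explicitly with a PRAGMA. The first part tries to delete a doctor who still has a patient; the second runs two inserts as one transaction where the second insert fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.execute("CREATE TABLE Doctor (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE Patient ("
    " social_security_number INTEGER PRIMARY KEY,"
    " name TEXT,"
    " doctor_id INTEGER REFERENCES Doctor(id))"
)
conn.execute("INSERT INTO Doctor (id, name) VALUES (1, 'Dr. Phil')")
conn.execute("INSERT INTO Patient VALUES (123456789, 'Sally', 1)")
conn.commit()

# Referential integrity: Dr. Phil still has a patient, so the delete is refused.
try:
    conn.execute("DELETE FROM Doctor WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
conn.rollback()

# Transaction: "with conn" commits if the block succeeds and rolls back if it raises.
try:
    with conn:
        conn.execute("INSERT INTO Doctor (id, name) VALUES (2, 'Dr. Jones')")
        # Doctor 99 does not exist, so this insert fails...
        conn.execute("INSERT INTO Patient VALUES (987654321, 'Bob', 99)")
except sqlite3.IntegrityError:
    pass  # ...and the insert of Dr. Jones is undone along with it.

doctor_count = conn.execute("SELECT count(*) FROM Doctor").fetchone()[0]
print(delete_blocked, doctor_count)
```

After the failed transaction there is still only one doctor in the table, because neither insert persisted.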


Sunday, March 19, 2017

Jalapeno Popper Sandwich with Bacon


Ingredients
  • bacon strips
  • sliced bread
  • jalapeno peppers
  • cream cheese 
  • shredded cheddar cheese
Instructions

Pepper prep
  1. cut the jalapeno peppers in half
  2. broil the peppers
    1. preheat to 400 and let it cook for 10-15 minutes until slightly charred
  3. let it cool for 10 - 15 minutes because you'll need to touch them next
  4. remove the skin and seeds
Cheese prep
  1. mix the cheese in a bowl
  2. or not. up to you.
Bacon prep
  1. if frozen, put it in the fridge and let it thaw
  2. bake the bacon
    1. preheat to 450 and let it cook for 15 minutes. longer if you like crispier bacon.
Bread prep
  1. toast the bread!
Assembly
  1. Put the cheese on the bread. Then the peppers. Then the bacon. 
Microwave for 1 - 2 minutes (to melt the cheese into everything else mmmmm) and serve :) 


Saturday, March 18, 2017

Migrating a wordpress.com site will take you longer than thirty minutes

A couple of weeks ago my girlfriend told me that she wanted to install Google Analytics on her wordpress.com blog. Unfortunately, you can't install analytics without upgrading to the business plan which is a whopping $24.92 a month. She had plans to monetize her site in the future (ads?), which isn't possible with the personal plan.

So I told her that she should consider a self-hosted site. In fact, it's so easy I could do it for her in less than an hour.

I just finished.

It took over a week.

Here's what I had to do:

  1. Create an account on a new hosting service 
  2. Install WordPress through cPanel using a temporary domain name
  3. Import content from the old WordPress site into the new one
  4. Install Jetpack after realizing that none of the shortcodes were working
  5. Install Google Analytics (FREE)
  6. Transfer the domain name from the current registrar to the new registrar (This process takes FIVE days)
  7. Buy an SSL certificate (got it for a dollar/year thanks to discount holla)
  8. Update DNS nameservers for domain (Takes about a day to propagate) to point to new host
  9. Update hosting plan to use current domain name and turn on SSL
  10. Replace WP database references to temporary domain using 
  11. Optimize the site with the help of Google PageSpeed because the site was slow AF. Went from a page speed score of 40 (ah!) to 96 (yay) by using a combination of plugins:
    1. autoptimize
    2. wp super cache
    3. wp asset cleanup 
    4. speed up javascript to footer
    5. speed up optimize css delivery 
Finding the right combination of plugins was a bit of trial and error. Some claimed to work but didn't. So I had to keep inspecting the page source and running page speed to test whether or not the plugins were making any difference to the site performance. 


Done! 

Design Patterns

Humans are great at seeing patterns. When a programmer solves a lot of problems, he starts to see solutions that seem generally applicable to a wide category of problems. When those problems are design problems (how to manage the objects and the relationship between objects), those solutions are known as "design patterns".

There are different types of patterns:
  • Creational
    • ways to manage the creation of objects
  • Behavioral 
    • ways to manage the communication of objects
  • Structural 
    • ways to organize objects
A common structural pattern is known as the decorator or wrapper pattern. You use it by wrapping an object with another object that has the same interface which then modifies the behavior of the inner object. You're basically creating an onion :)
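Here is a minimal sketch of that onion in Python. The class names (PlainText, Bold, Italic) and the render method are made up for illustration. Each wrapper exposes the same render interface as the object it wraps and modifies what the inner object returns.

```python
class PlainText:
    """The innermost object: just returns its text."""

    def __init__(self, text):
        self.text = text

    def render(self):
        return self.text


class Bold:
    """Decorator: same render() interface, wraps the inner result in <b> tags."""

    def __init__(self, inner):
        self.inner = inner

    def render(self):
        return "<b>" + self.inner.render() + "</b>"


class Italic:
    """Decorator: same render() interface, wraps the inner result in <i> tags."""

    def __init__(self, inner):
        self.inner = inner

    def render(self):
        return "<i>" + self.inner.render() + "</i>"


# Layers are chosen and stacked at run time, one wrapper around another.
rendered = Italic(Bold(PlainText("hello"))).render()
print(rendered)
```

Because every layer has the same interface, the caller doesn't need to know how many wrappers are in play.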

Decorator pattern in the wild

If you've ever built a Rails application, you've probably run into situations where the controller is filled with code responsible for formatting your data for presentation. There's a library called Draper that allows you to write decorator models that wrap your models with functions that will do the formatting, which removes an additional responsibility from your controllers.

Why not just use inheritance?

Inheritance is less flexible. With decorators, you can decorate your objects at run time by wrapping them with other objects. With inheritance, the behavior is fixed when the class is defined, so you lose that flexibility.


Object Oriented Programming

OOP is just one approach to creating computing abstractions that's very popular because it's based on how we already think and perceive. We see the world in terms of things (objects) doing things (methods), and thus a programming language that allows us to define computing processes using that way of thinking feels much more familiar than, say, writing ones and zeros. OOP languages are all just variants of how to formally define those abstractions to a computer. 

When I learned about object oriented programming in college using Java, I first learned a bunch of weird buzzwords like encapsulation, inheritance, polymorphism and that you need to make classes of things before they can do things. Oh and that in order to share the same set of behaviors in one class in the definition of another you have to use the "extends" keyword. So boring.  

This kind of introduction to OOP that starts with the unique terms of OO and language specific keywords makes it harder to grasp the essence of the why behind OOP. Students don't need to know the difference between an abstract class and an interface to "get" OOP. 

Regardless of what OO language you use, you'll always be dealing with objects and data structures. They're both data, but objects are a higher level abstraction that also comes with actions, whereas data structures are just pure data. All the other terms surrounding OO that are language specific are just ways that the language allows you to define the behavior of those objects and their relationships with other objects. 

Concepts like "polymorphism" and "encapsulation" are just things that are made easier by OO programming, but are not exclusive to OO programming languages. I think it's much more effective to teach how to model processes with OO first, then introduce more specialized concepts once the basic big picture understanding is cemented. Things like "encapsulation" start to make perfect sense once you start seeing the benefits of hiding the details of how a thing does something. 


Friday, March 17, 2017

Concurrency

Why do one thing at a time if you can do many things at a time? That's what multi-core processors allow computers to do: many things in parallel. Even on a single-core processor, from the user's perspective it still seems as if things are running in parallel because operating systems are so damn good at context switching.

A thread is the fundamental unit of execution in a computer. It starts somewhere and ends somewhere. Once you introduce multiple threads of execution, you open up a whole can of problems if you don't synchronize your threads!

For example, let's say Sally and Bob share a bank account. The way the system works is that there are a number of ATMs in different locations that are connected to the central banking system. When someone initiates a transaction, a new thread of execution is started in the program that runs in the central system.

In this program, let's say there are accounts (one of which is shared by Bob and Sally).

The code involved in updating the balance as a result of a withdrawal is as follows:

newBalance = userBalance - amount (the calculation)
userBalance = newBalance (the update)

Now let's say Bob and Sally both attempt to withdraw at the same time. Let's say Thread A is the thread initiated by Bob and Thread B is initiated by Sally.

1. Bob and Sally both initiate a withdrawal of $100 from a starting balance of $500.
2. Thread A calculates the new balance to be $400 and then gets suspended while Thread B is run. At this point, the user balance is still $500.
3. Thread B ALSO calculates the new balance to be $400 and then goes to completion.
4. Thread A resumes and finishes, writing its already-calculated balance of $400.

They both withdraw $100, for a total of $200, but the final balance is $400 instead of $300 :)

Luckily, there are several ways to synchronize threads so that things like this don't happen. A common construct is a semaphore which basically protects shared resources. So if one was used in this situation, it would be used to protect the account resource. If someone is already using the account to withdraw or deposit, don't let anyone (any other thread) access it.
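Here is a minimal sketch of protecting the account in Python. The Account class and run_withdrawals helper are made up for illustration, and it uses threading.Lock, which acts like a semaphore with a single permit (threading.Semaphore(1) would work similarly). Only one thread at a time can be inside the calculate-then-update section.

```python
import threading


class Account:
    """Hypothetical shared account whose balance is guarded by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        # Only one thread at a time may run the calculation and the update.
        with self._lock:
            new_balance = self.balance - amount  # the calculation
            self.balance = new_balance           # the update


def run_withdrawals(account, count, amount):
    """Start `count` threads that each withdraw `amount`, then wait for all of them."""
    threads = [
        threading.Thread(target=account.withdraw, args=(amount,))
        for _ in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance


# Bob and Sally each withdraw $100 from a starting balance of $500.
final_balance = run_withdrawals(Account(500), 2, 100)
print(final_balance)
```

With the lock in place, Thread B can't read the balance until Thread A has finished writing it, so both withdrawals are counted.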

How should you sort in this situation?

A master directory server receives a list of accounts, ordered by user ID, from each of several departmental directory servers. What's the best approach for this server to create a master list combining all the accounts ordered by user ID?

Questions you should ask:

  1. Is the list of accounts from each individual server already sorted?
  2. Can all the id's fit in memory?
Scenarios:
  • Individual lists are not sorted 
    • extra memory available
      • You can pretty much just pick a sorting algorithm based on speed. Quick sort will do.
    • extra memory not available
      • You're still fine with most sorting algorithms as long as they're in-place (such as quicksort or selection sort). 
  • Individual lists are sorted
    • extra memory available
      • Merge sort can be very efficient here since the merge operation is O(n). Since we have extra memory, the auxiliary memory it needs may not be an issue. 
    • extra memory not available
      • You can still use merge sort if you lazy load the sublists. So instead of loading O(N) records in memory that will require an additional O(N) extra memory for the temporary buffer, you'll just have the O(N) temporary buffer and read the values of the sublists as you need them from either the disk or server. 
As you can see, there's no such thing as the "best" general sorting algorithm because it depends on what the constraints are!
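For the case where every departmental list is already sorted, here is a minimal sketch of the merge step in Python. The function name merge_account_lists and the sample IDs are made up for illustration. It relies on heapq.merge from the standard library, which reads its inputs lazily and yields one merged sorted stream, so the inputs could just as well be iterators that pull records from disk or a server.

```python
import heapq


def merge_account_lists(*sorted_lists):
    """Lazily merge several already-sorted sequences of user IDs into one sorted stream."""
    return heapq.merge(*sorted_lists)


# Hypothetical user IDs from three departmental servers, each already sorted.
dept_a = [3, 8, 15]
dept_b = [1, 9, 20]
dept_c = [4, 5, 30]

master = list(merge_account_lists(dept_a, dept_b, dept_c))
print(master)
```

Only one pending value per input list is held at a time, which is what makes the lazy-loading approach workable when memory is tight.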

Sorting Algorithms

There are only two reasons to sort a list:

1. So that you can present it in a way that's more readable. Say, a list of state names in alphabetical order.
2. Increase the efficiency of things you do to that list. For example, retrieving and removing the max value element in the list over and over (for some reason...) 

There are a lot of ways to achieve both of those goals. And they all force you to understand the properties of the list you want to sort and the characteristics of the sorting algorithm you use. 
  • How much data are you trying to sort? 
  • Is that data partially sorted?
  • Can all that data fit in memory? 
  • Does the algorithm you use maintain the relative position of equal values? (stability) 
  • Does the algorithm require extra memory?
If you're a programmer, you should at least know the basic properties of some of the most common sorting algorithms such as ...
  • Bubble sort (O(n^2))
    • Go through the list looking at pairs of adjacent elements. If the left one is greater than the right, swap their positions. Repeat until a full pass makes no swaps.
  • Selection sort (O(n^2))
    • Start at the first element. Now find the minimum and swap the minimum with the first element and then move on to the second element and find the minimum from the second element to the last and swap. 
  • Insertion sort (O(n^2))
    • Start at the second element and check if it's smaller than the first. If so, insert it before the first (the first becomes the second). You now have a partially sorted list! Now, you move on to the third element and insert it in the right position in that partially sorted list and continue until you have a fully sorted list.  
  • Quick sort (O(n log n) on average, O(n^2) in the worst case)
    • Pick a pivot and swap the elements so that all the numbers smaller than or equal to the pivot are on the left and those larger are on the right. Do this recursively on each side!
  • Merge sort (O(n log n))
    • Split the list into individual elements, then merge them together (first by pairs) in order. The merge algorithm works by creating a new list and repeatedly comparing the front elements of the two lists being merged, appending the smaller one to the new list. 
Most general purpose languages include sorting functions in their core libraries so you shouldn't ever have to write one yourself :)
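Still, writing one or two out is a good way to check your understanding. Here is a minimal Python sketch of insertion sort and merge sort as described above. The function names are my own, and both return a new sorted list in ascending order rather than sorting in place.

```python
def insertion_sort(items):
    """Grow a sorted prefix by inserting each next element into its place."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for `current`.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def merge(left, right):
    """Merge two sorted lists into a new sorted list in O(n) comparisons."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from the left on ties keeps equal values in order (stability).
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Split the list in half, sort each half recursively, then merge."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(insertion_sort([5, 2, 9, 1, 5]))
print(merge_sort([5, 2, 9, 1, 5]))
```

Note that merge_sort allocates new lists at every level, which is the extra memory cost mentioned earlier.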

Telephone Words

Write a function that takes a seven-digit telephone number and prints out all of the possible "words" or combinations of letters that can represent the given number. 

Recursive Solution
  • If we've passed 7 digits, print out our current word. Otherwise, loop through each letter associated with the current digit, save that letter in the current position, and then recurse.
If you want to return the words instead of merely printing them, you would want to maintain a running list of all the words that grows as the functions "unwind". 
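A minimal Python sketch of the recursive approach, returning the words instead of printing them. The KEYPAD mapping is the standard phone keypad, and treating 0 and 1 as standing for themselves is my own assumption since they have no letters. The function works for any number of digits, not just seven.

```python
# Standard phone keypad letters; 0 and 1 carry no letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def telephone_words(digits):
    """Return every letter combination that can represent the digit string."""
    results = []

    def recurse(position, letters):
        # Base case: we've passed the last digit, so the word is complete.
        if position == len(digits):
            results.append("".join(letters))
            return
        digit = digits[position]
        # Assumption: digits without letters (0 and 1) stand for themselves.
        for letter in KEYPAD.get(digit, digit):
            letters.append(letter)   # save the letter in the current position
            recurse(position + 1, letters)
            letters.pop()            # undo it before trying the next letter

    recurse(0, [])
    return results


print(telephone_words("23"))
```

A seven-digit number made only of three-letter digits produces 3^7 = 2187 words.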

Iterative Solutions
  1. Just use 7 for loops :)
for ...
    for ...
        for ...
            :(
  2. If you write out a bunch of combinations for a number by going from left to right, you'll notice that each time the last position finishes a full cycle, the position to its left advances by one. 
For example, let's say we're only dealing with three sets of characters: [[a,b,c],[d,e,f],[g,h,j]]. Our pattern will be as follows:

a,d,g
a,d,h
a,d,j
a,e,g
a,e,h
a,e,j
a,f,g
a,f,h
a,f,j
b,d,g
...

As you can see, after we go through [g,h,j] at the end once, the previous position's letter changes from a lower value to a higher value (d to e). We can take advantage of this simply by starting out with a word, say "a,d,g". Then we'll change "g" to "h" and print. Then change "h" to "j" and print. And then since we've gone through a full cycle, we'll reset that position back to its first letter ("g") and change its left neighbor to the next higher value ("e"). The trick here is updating the counters for each position correctly (like when [d,e,f] is cycled through, its neighbor is updated). 
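The counter idea works like a car odometer. Here is a minimal Python sketch of it, where the function name iterative_words is made up and the input is a list of letter sets like the three in the example. Each position keeps an index into its own letter set, and a position that wraps around carries over to its left neighbor.

```python
def iterative_words(letter_sets):
    """Generate every combination by advancing per-position counters like an odometer."""
    if not letter_sets or any(len(letters) == 0 for letters in letter_sets):
        return []
    counters = [0] * len(letter_sets)  # one index per position, all starting at 0
    words = []
    while True:
        words.append(
            "".join(letters[index] for letters, index in zip(letter_sets, counters))
        )
        # Advance the rightmost counter; on wrap-around, reset it and carry left.
        position = len(counters) - 1
        while position >= 0:
            counters[position] += 1
            if counters[position] < len(letter_sets[position]):
                break
            counters[position] = 0
            position -= 1
        if position < 0:
            # The leftmost counter wrapped, so every combination has been produced.
            return words


words = iterative_words([["a", "b", "c"], ["d", "e", "f"], ["g", "h", "j"]])
print(words[:4])
```

The output order matches the pattern written out above: the last position cycles fastest and the first position cycles slowest.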

Sunday, March 12, 2017

Recursion

"If you already know what recursion is, just remember the answer. Otherwise, find someone who is standing closer to Douglas Hofstadter than you are; then ask him or her what recursion is."
- Andrew Plotkin

Informally defined, recursion is simply the process of repeating self-similar elements. The easiest way to visualize a recursive event is by standing in front of a mirror while holding a mirror. You’ll see an infinite repetition of the same reflection.

In math and computing, recursion is more formally defined.

In mathematics, a recursive definition of a function is one in which the function is applied within its own definition. A common example of this is the definition of the Fibonacci sequence. A recursive definition has two parts:

1. The base case or base cases (a non-recursive definition).
2. A set of rules that reduces all other cases to the base case(s).

In computation, a recursive function is one that invokes itself. In most programming languages, the definitions of recursive functions end up closely resembling the mathematical definitions.
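As a minimal sketch of that resemblance, here is the Fibonacci sequence in Python, assuming the common convention that F(0) = 0 and F(1) = 1. The two base cases and the reducing rule map directly onto the two parts of the mathematical definition.

```python
def fibonacci(n):
    """Return the n-th Fibonacci number, using F(0) = 0 and F(1) = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Base cases: defined without any recursion.
    if n < 2:
        return n
    # Rule: every other case reduces toward the base cases.
    return fibonacci(n - 1) + fibonacci(n - 2)


first_ten = [fibonacci(n) for n in range(10)]
print(first_ten)
```

This direct translation recomputes the same values many times, so it gets slow quickly as n grows, but it mirrors the definition closely.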

Daylight Saving

To understand the "why" of daylight saving, we have to dig into the past.

In ancient times, the start and end of labor was determined by the rising and setting of the sun. The notion of having to abide by a specific time schedule is a product of the industrial revolution, when manufacturing factory owners began enforcing time schedules. Be at work at 7AM. Eat at 12PM. Go home at 8PM.

As people of industrial society, we’re accustomed to living by a strict time-based schedule. Nowadays, the starting and stopping in the operation of nearly all of our institutions are based on fixed, precise measurements of time. A store opens at an agreed upon moment in time and is opened for some measure of time. Since so many of our activities are based on some agreed upon measure of time, what if we change the time? What effect will that have on the activities?

In the late 19th century, a man by the name of George Vernon Hudson pondered those exact questions, and his answer became what we now call "daylight saving". Hudson was working a shift job that gave him leisure time to collect insects, and led him to value after-hours daylight.

George had later shifts (not uncommon amongst industrial workers), so he wished for the sun to still be out by the time his boss let him off! But if his boss lets him off really late, then he wouldn’t be able to collect insects. Of course, he could always ask his boss to let him out earlier, but most factory owners weren’t that nice - and they certainly weren’t going to change their hours of operation just for a bunch of insect collectors!

Now if your boss won't let you off early, who should you appeal to? Well, Hudson went to the government. If the government moved the time forward, he would get off work with more daylight left and be able to enjoy the sun! And this concept of moving the time forward to enjoy more sun in the evening is what is now called Daylight Saving.

Unfortunately for George, the government laughed at him. The idea of changing the time wasn't enacted anywhere until 1916, about two decades after he first proposed it.

Sunday, March 5, 2017

Learning about the Y-Combinator

Mike Vanier wrote an awesome article on the Y-Combinator function that I've been working my way through. It's a fascinating read. Here are my notes on it so far.

  • An explicit recursive definition is a recursive function definition whose body contains the name of the recursive function. For example, (define (hello a) (hello a)).
  • It's possible to generate the recursive version of a function without using an explicit recursive definition by use of higher-order functions. Holy shit, right? 
  • The Y-Combinator is one such higher-order function: it can generate the recursive version of a function from its non-recursive version. 
  • In functional programming, you can create the non-recursive version of a function by abstracting out the recursive call. So instead of referencing itself, it will reference a function that's to be passed in as an argument by a higher order function. Common technique in FP.
  • Pass an identity function as an argument to the non-recursive function to get a function that computes the factorial of 0 only. For example, (define factorial-zero (almost-factorial identity)). This will fail for N > 0. However, if you pass factorial-zero as an argument to almost-factorial, you get factorial-one, which works for N up to 1! This can be proven although why it works is still a mystery to me. 
  • By performing the above process infinity times, you can get the factorial of all numbers up to infinity. The function that can do that for all numbers up to infinity is the factorial function! But how the heck do you define this chain of functions up to infinity?
  • To be continued ...
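To make the chain of functions in these notes concrete, here is a rough translation into Python rather than the Scheme used in the article. The names follow the notes (almost_factorial, factorial_zero, factorial_one). The y function at the end is one commonly used strict-language form of the combinator, included as an assumption about where the notes are heading and not as the article's own definition.

```python
def identity(x):
    return x


def almost_factorial(f):
    """Non-recursive factorial: the recursive call is abstracted out as f."""
    return lambda n: 1 if n == 0 else n * f(n - 1)


# Each link in the chain is correct for one more input than the previous link.
factorial_zero = almost_factorial(identity)       # correct for n = 0 only
factorial_one = almost_factorial(factorial_zero)  # correct for n up to 1
factorial_two = almost_factorial(factorial_one)   # correct for n up to 2


def y(f):
    """A strict-evaluation Y combinator (sometimes called Z); an assumed form, see above."""
    return (lambda x: x(x))(lambda x: f(lambda n: x(x)(n)))


factorial = y(almost_factorial)

print(factorial_zero(0), factorial_one(1), factorial_two(2), factorial(5))
```

In this translation, "failing" for larger inputs means returning a wrong answer: factorial_zero(1) evaluates to 0 because identity is called where the recursive call should be.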