jBoxer

I change the directions of small pieces of metal for a living.

Little Joys of Rails (Part 1) - Renaming Associations

I’ve spent the last month or so teaching myself the basics of Ruby on Rails. By far my favorite thing about it is that, at every turn, I find a little feature or helper that works really well and does exactly what I want without feeling messy or “magical”.

Unfortunately, I have no one to share these little joys with, as pretty much none of my programmer friends know Ruby or Ruby on Rails. So, I figured I’d start sharing them on here. Hopefully, Rails newbies (like myself) will learn something, and Rails veterans will be able to chuckle knowingly as they fondly look back on the glory days of learning Rails.

Renaming Associations

If you have one model that belongs_to another, the association (in both directions) is named after the models. For example, if I have Users and Courses, and I want a course to belong_to its teacher (who is a User), I would start with the following:

class User < ActiveRecord::Base
  has_many :courses
end

class Course < ActiveRecord::Base
  belongs_to :user
end

This is nice and simple, but if I want to access a course’s teacher, I have to do it like this:

my_course = Course.first
puts my_course.user # Print out the course's teacher

That’s a little confusing. What if my course also has_many students (also Users)?

my_course = Course.first
puts my_course.users.first # Print out the course's first student

So, my_course.user is the teacher, and my_course.users are the students. There’s nothing to indicate that; I just have to know it. Wouldn’t it be nicer if the associations could be named for what they represent?

class Course < ActiveRecord::Base
  belongs_to :teacher, :class_name => 'User'
end

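To make this work properly, the foreign key needs to be teacher_id instead of user_id (that can be overridden too, but I won’t go into that right now). The has_many :courses on the User side then needs :foreign_key => 'teacher_id' as well, since it would otherwise keep looking for user_id. Then, you get to do this: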

my_course = Course.first
puts my_course.teacher # Print out the course's teacher

Much better!

Now, I recognize that this is far from a revolutionary feature. I’m pretty sure it’s possible in Django and most other MVC frameworks. I was just very happy to bump into it.

Installing MySQL on Snow Leopard

Early on at Cisco, I realized that it’s really beneficial to write down (step by step) what I’ve done when I install something new. It forces me to think about what I’m doing, it provides me with a guide in case I mess up and have to start over, and other people can benefit from it later.

With that in mind, I’ve decided to document my installation of MySQL on Snow Leopard (OS X 10.6). Hopefully, someone will get some use out of it, but if not, at least I’ll have documentation of what I did.

The PATH Variable

MySQL has a bunch of useful executables that aren’t in PATH by default. I could symlink to them, but I think that’s less maintainable than just appending to PATH.

  1. Open the Terminal

  2. Open up the .bash_profile file in your home directory. This is a bash script that runs every time you start the Terminal, so the PATH variable will be extended properly every time. If you have TextMate and its UNIX command line tools installed, do it like this:

    mate ~/.bash_profile

    Otherwise, do it like this:

    /Applications/TextEdit.app/Contents/MacOS/TextEdit ~/.bash_profile &

    If you use the second one, make sure to add the ampersand at the end.

  3. We want to add MySQL’s bin folder to PATH. MySQL’s going to be installed at /usr/local/mysql, so let’s add /usr/local/mysql/bin to PATH. Add the following line to .bash_profile:

    export PATH="/usr/local/mysql/bin:/usr/local/sbin:$PATH"

    This sets PATH to two new locations followed by its old value (so essentially, prepending the two new locations to PATH). You’ll notice that, in addition to /usr/local/mysql/bin (which I mentioned earlier), I also added /usr/local/sbin. OS X doesn’t include this in PATH by default, but I think it should, so I added it. I have no defense for that position, but this is as much a guide for myself as it is a guide for other people, so I don’t need to defend it :)

  4. Save the updated .bash_profile and close it.

  5. Quit and reopen Terminal so that .bash_profile will be rerun (you could also run it explicitly, but I prefer quitting and reopening).

  6. Run the following command:

    echo $PATH

    You should see /usr/local/mysql/bin and /usr/local/sbin in there now.

  7. Edit: Restart your computer. My installation worked without doing this, but Jon (in the comments) said he needed to do this before it worked, so you might as well.

Downloading + Installing

I might try compiling and installing from source at some point, but I don’t see any reason to when there are official OS X binaries available. Worst case scenario, it messes with a bunch of my settings, and I’ll uninstall and then do it from source.

  1. Go to http://dev.mysql.com/downloads/mysql/5.1.html#downloads
  2. Scroll down to the Mac OS X section (near the bottom). Download the version labeled “Mac OS X 10.5 (x86_64)”. It’s very important that you download the 64-bit version. The 32-bit one will give you preference pane problems (I call them PPPs whenever I talk about them, though this is the first time I ever have). Eventually, there’ll probably be a Mac OS X 10.6 version, but for now, this works perfectly.
  3. Open the .dmg, then run mysql-x.x.xx-osx10.5-x86_64.pkg. Continue through all the screens without changing anything, until it finishes.
  4. Optional: install the preference pane by running MySQL.prefPane.
  5. Optional: make MySQL start up along with OS X by running MySQLStartupItem.pkg. I didn’t do this (I often use my computer for things other than development, and I don’t want to bog it down unnecessarily), so I can’t provide any instruction or vouch for how well it works on Snow Leopard.

You should now be able to start MySQL by going into the MySQL preference pane and clicking “Start MySQL Server”. If it doesn’t work, leave me a comment, and I’ll try to help you.

Django middleware vs. context processors

This may be old news to many people, but it’s something I just recently learned (after doing it wrong for about nine months), so I figured someone else may be able to benefit from my mistakes.

For a long time, whenever one of my Django views needed the currently logged-in user, it followed this pattern:

from django.shortcuts import get_object_or_404, render_to_response
from django.template import RequestContext

# some other view code

def my_view(request, obj_id):
  # Anti-pattern: building a RequestContext early just to reach the user
  context = RequestContext(request)
  obj     = get_object_or_404(SomeModel, pk=int(obj_id))

  # do some stuff, including:
  some_function(context["user"])

  return render_to_response("some_template.html", {"some_var": some_val},
    context_instance=context)

Now, seasoned Django vets (why are you reading this post, by the way?) are probably laughing, but this seemed perfectly fine to me. Luckily, my ignorance was revealed to me while asking on IRC about a problem which, while it didn’t seem so at the time, was completely tied to my misuse of RequestContext.

To put it simply, context processors are made to be used in templates. The only time a RequestContext (which is what runs them) should ever be instantiated is at the very end of a view, like so:

return render_to_response("some_template.html", {"some_var": some_val},
  context_instance=RequestContext(request))

I was using them all over views, and even in a few of my decorators. What’s wrong with this? The main thing is, certain mutation functions are sometimes performed when a RequestContext is instantiated (such as user.get_and_delete_messages(), which gets all of a user’s messages from the database and then deletes them), and performing them multiple times before loading the template can cause unexpected results. In my example, I was instantiating a RequestContext in a bunch of decorators, which meant that by the time my template was loaded, all of the user’s messages were deleted (and stored in an earlier instance of RequestContext), making it look to me as if my auth messages were being thrown out.
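
To see why repeated instantiation bites, here is a minimal pure-Python sketch of the effect. FakeUser and build_context are hypothetical stand-ins, not Django's own classes; they only mimic the read-then-delete behavior described above.

```python
class FakeUser:
    """Hypothetical stand-in for a user whose messages are read-once."""

    def __init__(self, messages):
        self._messages = list(messages)

    def get_and_delete_messages(self):
        # Return the pending messages and clear them (read, then delete)
        messages, self._messages = self._messages, []
        return messages


def build_context(user):
    """Hypothetical stand-in for instantiating a RequestContext."""
    return {"user": user, "messages": user.get_and_delete_messages()}


user = FakeUser(["Your profile was updated."])
early_context = build_context(user)     # e.g. created inside a decorator
template_context = build_context(user)  # created at the end of the view

print(early_context["messages"])     # ['Your profile was updated.']
print(template_context["messages"])  # []
```

The second context is the one the template would see, and by then the messages are already gone.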

What’s the solution? Middleware. Middleware allows the programmer to attach variables to the request object before it reaches the view. With this (and the provided django.contrib.auth.middleware.AuthenticationMiddleware), I can still achieve my original goal (accessing the logged-in user from a view), but now I can do it without creating a RequestContext and potentially running mutation functions multiple times:

from django.shortcuts import get_object_or_404, render_to_response
from django.template import RequestContext

# some other view code

def my_view(request, obj_id):
  obj = get_object_or_404(SomeModel, pk=int(obj_id))

  # do some stuff, including:
  some_function(request.user)  # set by AuthenticationMiddleware

  return render_to_response("some_template.html", {"some_var": some_val},
    context_instance=RequestContext(request))

I’ve removed all the references to RequestContext from my decorators, and made sure to only instantiate it at the very end of views. Now, messages work perfectly. I’ve even written my own middleware (which you can learn how to do here) to load instances of a few of my own models into the request.
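
For the curious, a custom middleware of that kind might look roughly like the sketch below. It assumes the class-based style with a process_request hook that Django used at the time; CurrentCourseMiddleware, load_current_course, and StubRequest are hypothetical names made up for illustration, and the lookup is a placeholder rather than a real model query.

```python
class CurrentCourseMiddleware(object):
    """Old-style Django middleware sketch: attach extra data to the request."""

    def process_request(self, request):
        # load_current_course is a hypothetical lookup; a real one would
        # query a model based on the session, the user, or the URL
        request.current_course = self.load_current_course(request)
        return None  # None tells Django to continue on to the view

    def load_current_course(self, request):
        return getattr(request, "session", {}).get("course_id")


class StubRequest(object):
    """Minimal stand-in for an HttpRequest, only for trying the sketch."""

    def __init__(self, session):
        self.session = session


request = StubRequest({"course_id": 42})
CurrentCourseMiddleware().process_request(request)
print(request.current_course)  # 42
```

In a real project the class would also have to be listed in the middleware setting before views could rely on request.current_course.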

To reiterate, I’m aware that this is common knowledge for many people, but it took me 9 months of moderate Django use and embarrassment on IRC to discover it for myself. Hopefully, this will help someone else in a similar position.

Are you a better programmer than you were two years ago?

Two years ago, I was working on a fairly complicated (for my level of experience) web app for posting news articles. Quite a few times, I ran into situations where I was about to create tight coupling between two somewhat unrelated parts of my app. Sometimes, I wouldn’t even recognize this as a problem. Other times, I would, but not be able to think of an easy fix, so I’d continue on.

Today, I find myself fixing tightly-coupled situations almost instinctively. I use design patterns that I was barely aware of two years ago, and I do it without straying into “design pattern fever” territory. This isn’t to say I’m an expert at software design; it still takes a lot of thought to actually think of the best fix, and I’m sure I still make plenty of mistakes. My point is, my growth as a programmer is very obvious to me in these situations.

How about you? If you’ve been programming for more than two years, you’re probably a better programmer than you were two years ago, but can you tell? If so, how can you tell? Can you give any good examples of moments when you realized it?

Non-painful email on Django development servers

I’ve been actively learning and using Django since August 2008, and I’ve loved almost every bit of it. There are plenty of places to read all about the virtues of Django, so I’ll leave that out for now.

One thing that’s always bugged me about web development in general is the sending of emails. I do development on my local computer (with a badly set up Apache / MySQL / PHP / Python / whatever else stack), and I’ve never felt like dealing with the headache of setting up a mail server. This means, when I add something that’s supposed to send an email (like an activation email after registration), I have to get very hacky to test and debug it (making sure the email text is being produced correctly, making sure it’s being sent to and from the right people, etc.).

This was one of the few web development pains that I thought Django didn’t solve. Whenever I’d test a bit of code that was supposed to send email, I’d get a “Connection refused” error page (meaning my computer has no mail server to send the email with). I would usually add in a bit of printf debugging to make sure the subject and body had the correct text, but beyond that, I’d usually wait to test the email portions until I uploaded to a server that could send email (usually the production server, unfortunately).

Yesterday, I bumped into a little section in the Django documentation that explains how to get around this. As usual, Python has all the solutions. First, add this code to your settings.py file:

EMAIL_HOST = 'localhost'
EMAIL_PORT = 1025 # replace this with some free port number on your machine

Then, assuming you’re on a Unix system (I’m on a Mac), run the following on the command line to start a “dumb” Python mailserver:

python -m smtpd -n -c DebuggingServer localhost:1025

Make sure to replace 1025 with whatever you filled in for EMAIL_PORT.

Now, try running the email-sending code in your Python application. Voila! No error pages (or at least, none related to email), and the full text of the email (headers and all) appears in whatever command line prompt you ran the dumb mailserver on. This allows you to see the senders, recipients, subject, and body of the email being sent out, all without getting hacky or sending to an email account you own.
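
If you want to poke the dumb mailserver without going through Django at all, something like the following sketch should do it. It assumes Python 3 and that the server is listening on localhost:1025; build_test_message, send_test_message, and the example.com addresses are made up for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_test_message():
    """Build a small throwaway email (addresses are placeholders)."""
    message = EmailMessage()
    message["From"] = "noreply@example.com"
    message["To"] = "someone@example.com"
    message["Subject"] = "Dumb mail server test"
    message.set_content("If this shows up in the terminal, the setup works.")
    return message


def send_test_message(host="localhost", port=1025):
    """Send the test email; fails if nothing is listening on host:port."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(build_test_message())
```

Calling send_test_message() while the server is running should make the message appear in the server's terminal, the same way Django's emails do.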

Taking this a step further, I created a small bash script called “dumbmail” in /usr/local/bin that looks like the following:

#!/usr/bin/env bash
# Use the first argument as the port, falling back to 1025
if [ -z "$1" ]
then port=1025
else port=$1
fi

echo "Starting dumb mail server on localhost:$port"
python -m smtpd -n -c DebuggingServer "localhost:$port"

Now, when I’m testing a Django application and I get to a section that is going to send an email, I just run “dumbmail” (or “dumbmail some_number” if I need to use a different port, for some reason I can’t imagine), and I’m ready to go.

Hope this helps people. The documentation was always there - I just never noticed that part until yesterday.

Round Rectangles (or Why Steve Jobs Is a Visionary)

A story was posted on Folklore.org (a site full of old stories about the creation and initial development of Apple computers) about the addition of rounded rectangles to an old drawing program, and Steve Jobs’s involvement in them. This section in particular struck me:

Bill fired up his demo and it quickly filled the Lisa screen with randomly-sized ovals, faster than you thought was possible. But something was bothering Steve Jobs. “Well, circles and ovals are good, but how about drawing rectangles with rounded corners? Can we do that now, too?”

“No, there’s no way to do that. In fact it would be really hard to do, and I don’t think we really need it”. I think Bill was a little miffed that Steve wasn’t raving over the fast ovals and still wanted more.

Steve suddenly got more intense. “Rectangles with rounded corners are everywhere! Just look around this room!”. And sure enough, there were lots of them, like the whiteboard and some of the desks and tables. Then he pointed out the window. “And look outside, there’s even more, practically everywhere you look!”. He even persuaded Bill to take a quick walk around the block with him, pointing out every rectangle with rounded corners that he could find.

When Steve and Bill passed a no-parking sign with rounded corners, it did the trick. “OK, I give up”, Bill pleaded. “I’ll see if it’s as hard as I thought.” He went back home to work on it.

I think this is a fantastic example of what makes Steve Jobs one of the few true visionaries in the world. In the face of a big advancement like fast ovals (yes, it was definitely a big deal at the time), I would’ve been more than satisfied. I may have asked for rounded rectangles in a second iteration, but it would’ve been a simple feature idea. I believe most people would’ve reacted the same way.

What makes Steve Jobs special is his ability to quickly identify what’s really important. It seems so obvious after the fact. Of course people would like computers with translucent colored cases! Of course minimalist controls would make for a more accessible and desirable MP3 player! Of course rounded rectangles are an extremely common shape, and are really important to have in a drawing program! These ideas (and more) are all obvious now. But a year before Apple did them, other companies were struggling to innovate in these fields, and they were only “obvious” to Steve Jobs.

Of course, anyone can have a great idea that turns out to be the right way of doing things. This is why I distinguish between “true” and regular (false?) visionaries. People called M. Night Shyamalan a visionary after “The Sixth Sense” and “Unbreakable”. He then went on to make one more pretty good but decidedly un-visionary movie (Signs), an arguably good but decidedly un-visionary movie (The Village, which I loved, but I understand why others didn’t), and two duds (my apologies to the three people who liked them). The visionary label was applied to Shyamalan before he’d reached first base, and he got thrown out at second. This happens all the time, and these people are certainly not true visionaries.

True visionaries come up with visionary ideas so consistently that it becomes expected of them. And no one (off the top of my head) has a more consistent history of this than Steve Jobs.

A response to “The Extreme Google Brain”

A blogger named Joe Clark just made a post called The Extreme Google Brain. In it, he takes a side on the recent tiff between lead designer Douglas Bowman and Google, where the former left the latter out of frustration at having to prove every design decision with real-world test data.

Joe Clark whole-heartedly agrees with Bowman, as many others do (I myself am somewhat torn). However, I find Clark’s article to be a ridiculous rant, full of stereotyping, fact-inventing, name-calling, and other marks of awful opinion pieces.

One frequently-used tactic in his piece is the inventing of a fact, followed by a single related fact that’s supposed to prove it. Case in point:

Some of these boys and men exhibit extreme-male-brain tendencies, including an ability to focus obsessively for long periods of time, often on inanimate objects or abstractions (hence male domination of engineering and high-end law).

Yes, males dominate engineering and high-end law. However, the cause is the topic of endless debates, and yet Clark claims it’s due to “extreme-male-brain tendencies” like he read it in a science textbook. This “here’s a fact I made up (hence an already-known, tangentially-related fact)” pattern repeats itself a few times.

When he’s not making up facts, he’s stereotyping a group of tens of thousands of people based on the few he knows.

Apart from Bowman, I can think of only two Google employees I could stand to be around for longer than an elevator ride. My impression of “Googlers,” which I concede is based on little direct knowledge and is prejudicial on its face [note: apologizing in advance does not make it okay to say something idiotic], is one of undersocialized, uncultured, pampered, arrogant faux-savants who have cultivated an arrested adolescence that the Google working environment further nurtures. Their computer-programming skills, the sole skills valued by the company, camouflage the flaws of their neuroanatomy. Their brains are beautifully suited to the genteel eugenics program that is the Google hiring process but are broken for real-world use.

You get the picture. Throughout the rest of the article, he:

  • Contends that A/B testing has no value.
  • Makes up scenarios that he believes (and “speculate[s] that Bowman would not disagree”) accurately represent Google meetings.
  • Tells us that we can’t disagree with Bowman and still feel that technology juggernauts are becoming better at visual design.
  • Says that when a company uses anti-design (extremely minimal and not-necessarily-beautiful designs such as Google or Craigslist) and succeeds, they’re succeeding despite the anti-design. He then concedes that he can’t prove it, but assures us that if you’re “visually literate” (which Adobe defines as the “ability to construct meaning from visual images”, which anyone older than an infant can consistently do), you “just know it”.

There isn’t really much else to say about this. I’ve read quite a few articles that side with Bowman, and quite a few that side with Google, and many of them on each side were great articles with great points. This was not one of them. I’ve never heard of Joe Clark before, and based on this, I hope I never do again.

Cooking like my Mom

Every single holiday since I can remember, my mom has made a dish called Lokshen Kugel. She makes it with egg noodles, cottage cheese, sour cream, and a bunch of sugar, vanilla, and cinnamon (as well as the standard eggs and stuff).

Today, after getting the recipe from her, I made it myself. The weirdest feeling was opening the oven halfway through. There was a dish I’d seen dozens of times before, but for the first time, it wasn’t in the oven at my house (the one I grew up in), and it was made by me instead of my mom. It was very surreal.

I haven’t tasted it yet. But when I do, I’ll post about it again. I also took some pictures, so I’ll be sure to put them up (once I find the cord for my camera).

How to export a 1024x768 screencast from Adobe Premiere Pro CS4 to YouTube

Over spring break, I made some screencasts for the website I maintain for my job. We used a free, open-source screen recorder called CamStudio. Our plan was to upload them to YouTube and embed the videos into our site from there. The one problem: the Export Media dialog on Adobe Premiere Pro CS4 (the software I used to edit the screencasts) only has a setting for low-definition (320x240) videos. My screencasts were 1024x768, and shrinking them to 320x240 would make the video part pretty much useless.

So, I set about trying to find a setting that would work for a 1024x768 screencast. The majority of settings produced some weird-quality results, even with the quality settings turned up as high as possible, which makes me think it had something to do with a faulty interlacing/deinterlacing setting somewhere. When I changed the one interlacing setting I was able to find, the exported videos would work fine in QuickTime, but look completely messed up in VLC and, more importantly, YouTube.

Exporting to F4V actually did work in every media player I tried it in, but for some reason, YouTube was unable to convert it to a playable format.

Finally, after about 9 hours of fiddling with Adobe Premiere Pro’s Export Media dialog, I found a setting of acceptable quality that worked on YouTube. I doubt many other people will need this, but since I was unable to find any information about the problem I was having online, maybe I’ll save another person 9 hours of work. Here are the settings I used:

Export Settings

Format: QuickTime

Filters

No Changes

Video

  • Video Codec: H.264
  • Quality: 100
  • Width: 1,024
  • Height: 768
  • (Width and Height are unlinked)
  • Frame Rate: 30
  • Field Type: Lower First (this was the setting that deals with interlacing I believe)
  • Aspect: Square Pixels (1.0)
  • Render at Maximum Depth: Checked
  • Set Key Frame Distance: Unchecked
  • Optimize Stills: Checked
  • Frame Reordering: Unchecked
  • Set Bitrate: Unchecked

Audio

No Changes

Others

No Changes

I’m sure there are some optimizations to be made in these settings; things that are causing me to produce unnecessarily large files, things that make rendering slower, or things that, if I tweaked, would give me even better quality. However, after 9 hours, I don’t care; this works, so for now, I’m done. Hope this helps someone.

Benchmarking Python decimal vs. float

I’m writing a web app that includes, among other things, a good amount of (rational) non-integer numbers. Whenever I’m in this situation, and I’m using a language that supports Decimals (as opposed to just floats and doubles), I always wonder which one I should use.

If you’re a programmer, you understand the difference and the dilemma. Floats/doubles are very fast, as all computers (built within the last 15 years) have hardware specifically made to deal with them. However, they’re not perfectly accurate; because of binary representation, numbers that we use a lot (like 1/10, or 0.1) cause the same problems that 1/3 (0.33333…) causes in base 10.

Decimals, on the other hand, are slow. They are handled entirely in software, and thus take hundreds of instructions to do things that would take fewer than 10 with floats/doubles. The upside is that they represent decimal fractions exactly; 0.1 is 0.1 is 0.1.
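
Here's a quick illustration of the difference, using nothing but the standard library. The variable names are mine, and the last two lines add one caveat: decimal division is still rounded (to 28 significant digits by default), so exactness only covers numbers that have a finite decimal expansion.

```python
from decimal import Decimal

float_sum = 0.1 + 0.2
# Build Decimals from strings; Decimal(0.1) would inherit the float's error
decimal_sum = Decimal("0.1") + Decimal("0.2")

print(float_sum == 0.3)                # False
print(decimal_sum == Decimal("0.3"))   # True

# 1/3 has no finite decimal expansion, so it gets rounded
third = Decimal(1) / Decimal(3)
print(third * 3 == Decimal(1))         # False
```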

So the question becomes twofold:

  • Do I really need my numbers to be perfectly accurate?
  • How much slower are decimals than floats/doubles?

In my case, the accuracy would be nice, but not completely necessary. And thus, the latter question becomes important. I’m not writing a large application, and I don’t expect it to get too popular too quickly, so if the slowdown is only moderate, I’ll take the accuracy.

To learn what the slowdown was, I wrote two quick Python test programs:

Decimal test
from decimal import Decimal

a = 0

for i in range(0, 20000):
  a = Decimal('%d.%d' % (i, i))
  print(a)

Float test
from decimal import Decimal  # kept this in on the float version
                             # so they'd have the same overhead

a = 0

for i in range(0, 20000):
  a = float('%d.%d' % (i, i))
  print(a)

When I ran each of these with /usr/bin/time (which I just learned about a couple weeks ago, and has replaced counting seconds on my fingers as my favorite benchmarking tool), the decimal version took an average of about 1.5 seconds (over 10 runs), while the float version took an average of 0.5. Just to make sure no overhead was getting in the way, I upped the limit to 40000 and ran it again. Decimal took 3.0 seconds, float 1.0. I can now confidently say that Python floats are about 3x the speed of Python decimals.

Or are they? While this tests the creation and printing of decimals and floats, it doesn’t test mathematical operations. So, I wrote two more tests. I’m going to be doing a lot of division on these numbers, and that’s definitely the most expensive mathematical operation to compute, so I made sure to do it in the tests (along with some subtraction).

Decimal version
from decimal import Decimal

a = 0

for i in range(2, 20002):
  a = Decimal('%d.%d' % (i, i)) / Decimal('%d.%d' % (i - 1, i - 1))
  print(a)

Float version
from decimal import Decimal

a = 0

for i in range(2, 20002):
  a = float('%d.%d' % (i, i)) / float('%d.%d' % (i - 1, i - 1))
  print(a)

This time, the float version averaged about 0.6 seconds (1.15 with 40,000 iterations instead of 20,000), while the decimal version averaged over 11 seconds (23 with 40,000 iterations instead of 20,000). So while Python float creation and printing is merely 3x as fast as Python decimal creation and printing, Python float division is almost 20x as fast as Python decimal division.
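
One thing these tests can't tell apart is how much of the time goes to string formatting and printing versus the division itself. A rough way to isolate the division, sketched with the standard timeit module, is below; build_pairs, divide_all, and time_division are names I made up, and the numbers it produces will vary by machine and Python version.

```python
import timeit
from decimal import Decimal


def build_pairs(number_type, count):
    """Pre-build (numerator, denominator) pairs so parsing isn't timed."""
    return [
        (number_type('%d.%d' % (i, i)), number_type('%d.%d' % (i - 1, i - 1)))
        for i in range(2, count + 2)
    ]


def divide_all(pairs):
    """The only work being timed: one division per pair, no printing."""
    return [a / b for a, b in pairs]


def time_division(number_type, count=20000, repeats=5):
    """Return the best-of-N time in seconds for dividing every pair once."""
    pairs = build_pairs(number_type, count)
    return min(timeit.repeat(lambda: divide_all(pairs), number=1, repeat=repeats))


print("float:   %.4f s" % time_division(float))
print("decimal: %.4f s" % time_division(Decimal))
```

Taking the minimum over several repeats is a common way to reduce noise from whatever else the machine is doing.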

So what did I choose? Decimals. In the context of these tests, the decimal slowdown may seem significant, but if I finished my app using decimals and profiled it, I can almost guarantee (based on the speeds here) that the bottleneck would not be decimal division performance. If I were running an app that was handling hundreds of simultaneous requests, I might consider switching (I might also spring for better hardware, but that’s a different topic). However, for my purpose, 1/20th the speed of floats is more than fast enough.

P.S. As my very late discovery of /usr/bin/time should suggest, I’m extremely new to benchmarking. If anyone has any suggestions for me, or criticisms of my method, please leave your thoughts. This is something I’d like to get better at.