elfs: (Default)
To the nurses, I am the bed whisperer!

Unfortunately, not like that. But it’s still cool.

During my recent hospitalization, I was stranded on a hospital bed. Since I had fainted due to blood loss, I wasn’t allowed to get out of bed without an attendant. They even had frickin’ lasers watching, and if the lasers sensed both feet on the floor an alarm would go off.

[Photo: the bed’s nurse-side control panel]

Worse, the beds were broken. But, as someone else said recently, “I have ADHD, an internet connection, excellent research skills and poor self-regulation.” I took photos of the nurse’s side panel and underside to identify the make and model, downloaded the manual, discovered that it was no help at all, but ultimately found MedWrench, where nursing home technicians exchange gripes, complaints, and potential solutions. They hate this particular bed, but it’s popular because it’s cheap. Quelle surprise.

At one point, the nurse was struggling to make the bed go up. “Let me,” I said. “I’m the bed whisperer.” I reached through the handle, pressed the Global Lock + All Lift Locks buttons simultaneously, then pressed the specific lift she was looking for (raising the head of the bed).

“You are the bed whisperer! How did you do that?” She looked incredulous.

I explained, and I said I didn’t know how long that hack would last before the bed failed entirely. MedWrench says after that, you have to dig a panel open and re-seat a chip on the control board; it’s poorly held in and falls out if you move the bed too often. The most approved technique is to drill an 8mm hole at a location shown in various images and push the chip back in with a pencil eraser.

A little later, a nurse was checking on my roommate. She shuffled over to me and said, “Can you show me how to unlock the bed? The other nurse said you know how.”

“I can’t unlock it if that’s not working. But I can show you how to override the lock while you’re using the bed.” I showed her the trick (Hmm, two in the front, one in the back… I should nickname it The Shocker!), she went to the other patient, I heard the whirring and she shouted, “It worked! Thank you!”

By the end of the day, the nurses had pretty much all taught the trick to each other.

As I said, I have excellent research skills. But I refuse to believe that an entire hospital full of professional nurses had no one with the curiosity and even mediocre research skills to relieve a serious, ongoing problem. It boggles my mind that no one at that hospital had done what I’d done: downloaded the manual for a thing they use every day, then looked further when the instructions in the manual didn’t work!

A friend of mine suggested I have a programmer’s mindset: we get libraries and toolkits with crappy documentation, and we’re used to supporting each other with tips and tricks when we’re asked. But surely that’s not unique to my profession? Are we the last bastion of problem solvers in our species? No, I don’t believe that at all. But what to make of the fact that it took a patient to solve one of the staff’s most commonplace irritations?
elfs: (Default)
For reasons, I ended up trying to read the documentation on LINQ, Microsoft's new DSL for refining collection lookups, i.e. sorting and looking stuff up in large datasets. It's a lot like embedded SQL, except that it works, performantly, on regular language features like lists and lookup tables.

MSDN's (Microsoft Developer Network) LINQ documentation page has an "outline" on the left side. The feature I wanted, "LINQ to XML," was not in the outline. I found it down at the bottom of the document. I clicked on it, and followed the links down through five different pages until I found, you know, actual examples.

The outline on the left is worthless. It obscures more than it shows. If I have to "open" a level of the outline, and all that does is show me what's on the page I'm currently looking at, then I'm not getting an index, I'm getting an index-shaped-thing that might help me once I'm completely familiar with the material. It's not searchable in-page using Ctrl-F, not when the content is hidden. As someone who learns best looking at other people's code, this was an exercise in frustration, and it really needs to be re-thought.
elfs: (Default)
We have a phrase in the programming world: the Dark Programmer. The term comes from the analogy with dark matter: we know it exists because it exerts gravitational force on our galaxy, but other than that we have no idea what it is. A dark programmer is similar: we know they exist because someone keeps writing Java-based actuarial software for insurance companies, banks, hospitals, and other large institutions, but they aren't the sort of people who post to GitHub or Bitbucket; they don't contribute to Stack Overflow, and they generally aren't interested in advancing, or even learning much about, the state of the art. They just want to do their job for the day, go home, and not think too much about what they do.

It occurred to me, while watching the credits roll by for the movie Frozen, that there seem to be Dark Artists as well. Looking through the various artists, digital CGI animators, and ink-and-paint animators credited on the film, I went through all of them to find out how many had some sort of presence on the web related to their love of their art.

About half. Almost all of them had some presence: they have LinkedIn accounts, IMDB entries, and the like. But only about half the artists had Deviant Art, Tumblr, Blogspot, or some other account where they shared process drawings and discussed their work with other people. (Mostly Blogspot. Which I think is weird. Is there something in the TOS of Blogspot that makes it more desirable for artists than the others?) Which means that half the artists on the biggest animated film of the year just want to do the work, take their paycheck, go home, and not think too much about what they do for a living.

The programming world needs dark programmers; I wonder if the same mindset is in play in animation departments. After all, not everyone who went to art school came out as passionate as when they entered.
elfs: (Default)
This afternoon, I had a problem. Django is a popular tool for building websites, and the Django administration tool is designed to show individual tables from a database and lists of their contents. I had a fairly complicated problem whereby a related, generic table was supposed to have (but may not have had) data associated with items in my target list. This relationship is tenuous, meaning the database doesn't understand it natively; some external process has to manufacture the relationship with every request for data. Django does this fairly nicely with a plug-in, but the plug-in doesn't always play nice with the administration tool.
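
A minimal sketch of that kind of tenuous relationship, assuming the plug-in in question is Django's contenttypes framework (modern import paths; the model names are made up for illustration):

from django.contrib.contenttypes.fields import GenericForeignKey, GenericRelation
from django.contrib.contenttypes.models import ContentType
from django.db import models

class Note(models.Model):
    # Two ordinary columns the database does understand...
    content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
    object_id = models.PositiveIntegerField()
    # ...stitched into a single relationship by Django at request time,
    # invisibly to the database itself.
    target = GenericForeignKey('content_type', 'object_id')

class Widget(models.Model):
    name = models.CharField(max_length=100)
    # The reverse side: a Widget may, or may not, have Notes attached.
    notes = GenericRelation(Note)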

I spent about two hours puzzling it out until I finally found the answer on Stack Overflow. When I found it, I facepalmed.

I had written that answer. In fact, not only had I written that answer, but at the time I had written it, barely 14 months ago, I was considered one of the world's leading authorities on doing weird things with the Django administrator.

I've forgotten more about Django than most Django heads have ever learned. Unfortunately, "forgotten" seems to be the operative word today.
elfs: (Default)
I used to joke, when someone in my family came down and saw a mess of code up in Emacs on my screen, that "No, really, I'm playing a video game. It just looks like work." Because I find coding fun. But the fact is that I also play games, and Portal 2 was definitely a brain-bender.

But not as brain-bending as Haskell.

That said, I just finished the first exercise in Seven Languages in Seven Weeks, although really I think it's more like Seven Languages in Seven Days, given my ferocious language consumption. I skipped right to the Haskell part, because I have a project that requires Haskell at the moment. Still, this was pretty cool. Don't read further if you don't want the answer.

( Code behind cut. )
elfs: (Default)

I wish I’d known this a long time ago.  Django’s request object includes a dictionary of key/value pairs passed into the request via POST or GET methods.  That dictionary, however, works in a counter-intuitive fashion.  If a URL reads http://foo.com?a=boo, then the expected content of request.GET['a'] would be 'boo', right?  And most of us who’ve used other URL parsers would expect that for http://foo.com?a=boo&a=hoo, the content of request.GET['a'] would be ['boo', 'hoo'].

Except it isn’t.  It’s just 'hoo'.  Digging into the source code, I learned that in Django’s MultiValueDict, __getitem__(self, key) has been redefined to return the last item of the list.  I have no idea why.  Maybe they wanted to ensure that a scalar was always returned.  The way to get the whole list (necessary when doing an ‘in’ request) is to call request.GET.getlist('a').
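
Here’s the behavior in a nutshell, a sketch you can run standalone; settings.configure() is only there because QueryDict wants Django settings to exist:

from django.conf import settings
settings.configure()  # minimal setup so QueryDict works outside a project

from django.http import QueryDict

q = QueryDict('a=boo&a=hoo')
print(q['a'])          # 'hoo': __getitem__ returns only the last value
print(q.getlist('a'))  # ['boo', 'hoo']: the whole list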

Lesson learned, an hour wasted.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Recently, I had the pleasure of attending another of those Seattle Django meet-ups.  This one was a potpourri event, just people talking about what they knew and how they knew it.  I revealed that I’d written my first aggregator, and that seemed to be an impressive statement.  Apparently Django Aggregators (ORM extensions that inject summarizing SQL, such as SUM or COUNT, into a query) are something of a black art, much like WordPress Treewalkers were a black art I figured out in just a few hours.

Aggregators consist of two parts: the definition and the implementation.  Unfortunately, Django’s idea is that these are two different objects, bound together not by inheritance but by aggregation (both the definition and the implementation are assembled in a generic context, one providing access to the ORM and the other to the SQL).  The definition is used by the ORM to track the existence and name of the aggregate, and is then used to invoke the implementation, which in turn creates the raw SQL that will be added to the SQL string ultimately sent to the server, whose results are ultimately parsed by the ORM.

I needed to use aggregation because I wanted to say, “For any two points’ latitude and longitude, give me the great circle distance between them,” and then say, “For a point (X, Y) on a map, give me every other place in the database within n miles great circle distance.”

The latter was not possible with Django’s QuerySet.extra() feature.  You can add a WHERE clause, but not a HAVING clause, and this definitely requires a HAVING clause when running on MySQL.  Using an Aggregator with a limit forces the ORM to realize it needs a HAVING clause.  Besides, it was a good excuse to learn the basics of Aggregation.  Ultimately, I was able to do what the task required: find the distance between any two US Zip Code regions without making third-party requests.
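
For the curious, here’s roughly the query in question, sketched in modern Django terms (the ORM grew math functions and filterable annotations long after this post was written); the ZipCode fields latitude and longitude are illustrative assumptions, and this is not the actual Django-ZipDistance code:

from django.db.models import ExpressionWrapper, F, FloatField
from django.db.models.functions import ACos, Cos, Radians, Sin

def zips_within(qs, lat, lng, miles):
    # Great-circle distance (spherical law of cosines), Earth radius in miles.
    distance = ExpressionWrapper(
        3959 * ACos(
            Cos(Radians(lat)) * Cos(Radians(F('latitude')))
            * Cos(Radians(F('longitude')) - Radians(lng))
            + Sin(Radians(lat)) * Sin(Radians(F('latitude')))
        ),
        output_field=FloatField(),
    )
    # Filtering on the annotation makes the database do the cut, the job
    # the Aggregator-with-a-limit trick was doing back then.
    return qs.annotate(distance=distance).filter(distance__lte=miles)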

I make absolutely no promises that this code is useful to anyone else.  The Aggregator is definitely not pretty: it’s virtually a raw SQL injector.  But it was fun.  Enjoy: Django-ZipDistance.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

I was looking at the Digg source code the other day and, I’ve gotta say, mega-props to the developer of their javascript.

The traditional rule in web development has become, “Put your Javascript at the end of your page.”  That way, all of the DOM objects you might refer to are guaranteed to be present when you start referring to them.  This had been a chronic problem in web development, associated with much pulling of hair and gnashing of teeth. The more modern way is to load your scripts up front, but use jQuery’s $(document).ready(...) method to ensure that nothing runs until your DOM tree is loaded. You might still have to wait on media assets (you used to be able to say “images,” but with HTML5 media assets include sound and video), but the DOM will be there and ready, and you’ll be free to attach onload events to the media assets.

The $(document).ready(...) technique ensures that your javascript will run correctly, but it still has one classic problem: download delay. If you want an anchor to do something more than just be a link when you click on it, you have to live with the chance that the user will click on it before the javascript has run, while it is still just a link.

Digg doesn’t live with that. Instead of using $(document).ready(), they do something much more clever. In the header of their HTML document, they create unique event names that signal when some part of the page has been handed to the browser. You can do this with jQuery, which lets anything at all be an event. Let’s say you have an anchor that you want to do something funky when you click on it, something Ajaxy. First, in a javascript library that you load in the header, you define the function:

function dosomethingfunky() { /* ... do something funky with $(this) ... */ }

Then you bind it (note the completely custom event name):

$(document).bind('somethingfunky_ready', dosomethingfunky);

And finally, in the body, when the anchor has been loaded into the DOM, you trigger the set-up right there:

<a href="http://someexampleurl.com" id="somethingfunky">Funk Me!</a>
<script type="text/javascript">$(document).trigger('somethingfunky_ready');</script>

Now that’s cool.  Your race condition window is down to milliseconds; the event handler is probably set up even before the browser has a chance to render the object.  Yes, there’s a price: your page is littered with triggers and unobtrusiveness is a thing of the past.  But if you need this kind of set-up and performance, this is how it’s done, aesthetics be damned.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Following up on The Onion’s announcement that they’re using Django comes this priceless discussion of the technical challenges of doing so with several members of The Onion’s technical team. They were using Drupal before.

Among the things I discovered:

  • Grappelli, a customizable theme for the Django admin
  • uWSGI, a high-performance WSGI container that runs separately from Apache
  • HAProxy, a viable open-source TCP/IP load balancer

I’m constantly reminded, when I work on IndieFlix, “You’re not YouTube. Don’t code like you are ever going to be YouTube.” And they’re right. If I ever reach that level of technical difficulty, I’ll deal with it then. But we very well could have traffic issues similar to The Onion’s, and that’s not a bad target to aim for.

Also not to be missed in this conversation: The Onion cut 66% of their bandwidth by upstream-caching 404s.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com

Git Giggle

Mar. 31st, 2010 05:35 pm
elfs: (Default)

[Image: a small Git tree]

Giggle has become my new best friend.  Giggle is a graphical front-end for Git: you start it in a Git repository and it keeps excellent track of all of your branches, their history, mergings, and so on.  Since one of my big initiatives in my current position has been refactoring an inappropriately large model class (multiple things being modeled in one class, and a metric ton of combinatorial excess around the different kinds of people being modeled) and its attendant view (another, orthogonal mess of combinatorial excess around different lists-of-lists, not to mention a couple of one-shot views at the bottom that really need isolation), I’ve been making a variety of stabs at cleaning them up.

Git makes branching and tracking changes cheap, and there are changes galore: one branch here is master, but there’s merge_roles (take the five common roles in our model, all of which have the exact same signature, and provide a simple integer field for the role-type), merge_filters (“lists of lists” views partitioned into a get-list function followed by a generic draw-list function), new_toolbar (some advanced navigational features in the look’n’feel wrapper), new_homepage (break out those one-shots that need isolation), and new_deployment (our old deployment needed a bit of an upgrade).

If I were doing this by hand, I’d go insane with patchfiles.  But with Git, it’s simple.  And with Giggle, I can actually see how simple it is.

There are some caveats, but they’re more about using git svn than about using Giggle.  “Always rebase, never merge” is probably the most important: Subversion has no idea what a git branch is all about, so as long as you rebase your master with whatever experimental branch you’re bringing into the mainline before you run git svn dcommit, all will be well.
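
For the record, my loop looks something like this; a sketch only, with branch names from the list above:

git checkout master
git svn rebase                   # pull the latest Subversion history onto master
git rebase master merge_roles    # replay the experimental branch onto the fresh master
git checkout master
git merge --ff-only merge_roles  # fast-forward only: no merge commit for Subversion to trip over
git svn dcommit                  # push the now-linear history back to Subversion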

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Apress’ Pro Git is a better book than O’Reilly’s Version Control with Git.  The O’Reilly book tries hard to educate you about the repository, but goes off into the weeds with details about history and branch management that overwhelm a user who “just wants to use the damn thing.”  The Apress book has a section on what goes on within the repository, complete with illustrations of blobs, pointers, trees, and so forth, but by eliding the blobs when talking about branches, and only going into detail when necessary, it makes the branch and merge process much easier to understand than the O’Reilly book.

Also, since I’m using Git on top of SVN, the Apress book’s section on using Git with other repository software is sufficiently technical without, again, overwhelming you with the details.

Kathy Sierra once posted about Just In Time vs Just In Case Learning.  The O’Reilly book quickly becomes a “just in case” book, whereas Apress’ book is much more “just in time.”  The Apress book supports my mantra, courtesy of Bre Pettis: “Pretending you know what you’re doing is almost the same as knowing what you are doing, so just accept that you know what you’re doing even if you don’t and do it.”

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Gwaredd Mountain writes:

Microsoft has published empirical data that shows that the process overhead for TDD increases the development effort by 15%–35%. Despite the many positive benefits from TDD, we cannot possibly consider anything that adds an extra 35% effort to produce artefacts the customer will never see as lean. Amazingly, people still try and associate this with “agile”.

That was, more or less, exactly what went wrong with the social networking group with which I was involved. We were sold both Agile and TDD at the same time, and the contradiction between the people-oriented development cycle of Agile (something at which I excelled at CompuServe) and the process-driven development cycle of formal TDD is what doomed the project and made it limp along without a real direction. Nobody ever stopped to ask what the whole thing was supposed to do; there were just these modules, all encrusted with a half-dozen tests, each ensuring that the module it tested conformed to some specification that existed only in the tester’s mind.  My mind included.

I believe that some process is important, even inevitable.  But I could see how the inherent friction between the TDD and Agile development processes paralyzed the company’s forward development progress.   I think the 37signals approach of “It’s a problem when it’s a problem” is one of the best pieces of advice I’ve ever seen, or as I put it, “I write code until I do something stupid, then I write a test to make sure I don’t do that stupid thing again.”

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

I have a contract that I’m working on that requires I work with rails.  That, in itself, isn’t so bad.  But I think what bothers me most about rails can be summed up in one word: partials. For example, let’s say I have the following:

render :partial => 'employee', :collection => @employees

What this means is that the file _employee.rhtml, using the internal variable employee, will be rendered once for each member of the collection @employees. The “magic” here is that the internal variable name and the partial name coincide. This is called, in rails, convention over configuration.  And while it makes perfect sense, it is in some sense straitjacketing.  Yes, I know, people will tell me that the internal variable name can be changed with the :as symbol; that’s not the point.  Ruby is such a malleable language that rails almost seems anathema to ruby in the first place: why use just about the most flexible language there is, and then create a set of conventions which must be memorized in order to make the thing go?

I kinda like ruby and rails, but they don’t seem to belong to one another.   It feels very much like a marriage with a mail-order bride.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

I had a job interview today, and one of the “challenges” with which I was presented was this: “We own several sites. We would like our user to be able to log into the central site as a subscriber, and then all the other sites will know what permissions that user has.”

The sites are cooperative, so getting content onto each one to support this idea isn’t difficult. Also, all of the sites belong to the same user, so getting a regular framework that you can deploy to all of them isn’t difficult either.

The scenario we want to support is this. We have two websites. One is the remote authentication server framework; let’s call it “rasf.com” (which, sadly, does not go to rec.arts.sf.written). Let’s call the site that wants authentication “brandx.com.” (Sadly, that’s a parked site that leads to a stock photography reseller.) The idea is that you’re a subscriber to rasf.com. You visit brandx.com, and Brand-X initially has no idea who you are. We want Brand-X to be able to say, “Hey, Rasf, who is this guy?” and have Rasf come back, “Oh, he’s John Smith, he’s a legit user, and here are some details.”

The trick involves public key cryptography.  Both Rasf and Brand-X have public and private keys.  To understand the scenario, you must appreciate that much of the heavy lifting is done away from both sites, on the browser.  The problem is that, in the browser, any windows opened by Rasf and Brand-X cannot communicate with each other; this is known as the sandbox, and it keeps malicious sites from using iframes or Ajax to inject malware Javascript into a victim page, or to rip data (like username and password keystrokes) out of it.  We want to violate the sandbox, but how?

Assumption: You’ve visited Rasf recently, and have a cookie from Rasf saying, “Yes, I, John Smith, am a valid user of Rasf and affiliated sites!”

You access a page from Brand-X.  Secretly, Brand-X creates a one-pixel-wide iframe and sets the iframe’s src attribute to the “authenticate this user” page on Rasf, including its public key as an argument in the request.  After the iframe loads, the loaded page from Rasf has the session information on the browser, and it has Brand-X’s public key.  The session information includes the Rasf cookie.  So now Rasf knows two things: it knows who you are, and it knows that Brand-X wants to know who you are.

How does that who-you-are information get back to Brand-X?  Here’s where the cooperation comes in.  The iframe from Rasf, using the onload() event, creates yet another iframe, this time calling back to a specified page (the cross-domain receiver page) on Brand-X’s site, and that URL is loaded with your user ID, a cross-domain session key, and other information, all signed with the Rasf private key (so the Brand-X site can verify it with Rasf’s known public key).
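
The signing step itself is ordinary public-key machinery. A minimal sketch of the server-side half, assuming RSA and Python’s cryptography package; the function names are mine, not part of any protocol:

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def sign_payload(rasf_private_pem, payload):
    # Rasf signs the user ID + cross-domain session key with its private key.
    key = serialization.load_pem_private_key(rasf_private_pem, password=None)
    return key.sign(payload, padding.PKCS1v15(), hashes.SHA256())

def verify_payload(rasf_public_pem, payload, signature):
    # Brand-X checks the signature against Rasf's well-known public key.
    key = serialization.load_pem_public_key(rasf_public_pem)
    try:
        key.verify(signature, payload, padding.PKCS1v15(), hashes.SHA256())
        return True
    except InvalidSignature:
        return False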

Now, because both the containing page and the iframe two layers in are in the same domain, they’re in the same sandbox and can communicate with one another via javascript.  The innermost iframe hands the information to the outermost page via its cookies, the page reloads, and all this information has now been pushed back to the Brand-X server, which can now use the signed cross-domain session key to make back-end requests of Rasf’s web services API and say, “Okay, now that I know he’s user 12345, and I have a session key validating this conversation, what else can you tell me about him?”

There are lots of other details here.  What if he’s not logged in to Rasf at all?  Well, enlarge the one-pixel iframe to a size big enough to show a log-in form within the Rasf domain, get his username and password, authenticate, and proceed as before.

This is a generic description of what Facebook Connect does, and it’s how you can do it as well.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Java is Pass-By-Value, Dammit!

Quite possibly the most important article I’ve ever read, because it finally, finally explains to me what Java’s object-passing model is really all about. I’ve never understood it, and now I do: it’s exactly backwards from pass-by-reference, and therefore exactly backwards from the languages with which I grew up. Object (and derivative) variables are pointers, not references, and calling them references only confuses people who grew up writing C (as I did).

Even more oddly, it explains to me what I’ve never quite understood about Python’s object model, because Python’s object model is exactly the same.   Reproducing the code in the article above in Python creates the same result:

class Foo(object):
    def __init__(self, x): self._result = x
    def _get_foo(self): return self._result
    def _set_foo(self, x): self._result = x
    result = property(_get_foo, _set_foo)

def twid(y):
    y.result = 7

def twid2(y):
    y = Foo(7)

r = Foo(4)
print(r.result)  # Should be 4
twid(r)
print(r.result)  # Should be 7

r = Foo(5)
print(r.result)  # Should be 5
twid2(r)
print(r.result)  # Still 5

This demonstrates that Python remains pass-by-value, with pythonic “references” in fact being pointers-to-objects. In the case of twid2, we change what the pointer y, which exists in the frame of the call to twid2, points to, creating a new object that is thrown away at the end of the call.  The object to which y pointed when called is left unmolested.

This is important because it changes (it might even disrupt) the way I think about Python. For a long time, I’ve been using python calls out of habit, just knowing that sometimes objects are changed and sometimes they aren’t.  Now that the difference has been made clear to me, in language that I’ve understood since university, either I’m going to be struggling for a while incorporating this new understanding, or I’m going to be much more productive.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

I’ve always been a little leery of studies that show that somehow, a bigger monitor equals more productivity.  Well, count me as no longer leery.  I’ve been hacking on a 24″ monitor I bought at a Christmas sale yesterday, and already I’m going along significantly faster than I was before.  For one thing, I can now have both Firebug and the screen I’m interested in visible at the same time.  That alone makes me twice as effective as before.  Being able to do both and have the source code editor visible at the same time?  Priceless.

Seriously.  If you code at home and your monitor is still 19″, do yourself a favor and go buy a bigger one.  Or, as a cheap alternative, if your card supports it, buy a second monitor and dual-head.  Whatever you do, get the capacity to have all your work visible: output, product, debugging information in one glance.

(Yes, I know, the title is tragically bad SEO, but I couldn’t hold back from a Hellraiser quote.)

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Yes, that’s a signal boost.

I’ve only played with Node.js for about 24 hours now, and I’m already deeply impressed with it. Node.js is something of a holy grail: an implementation of server-side (and desktop) Javascript with a modern engine (Google’s V8), in which all I/O is event-handled. You no longer care about multiplexing, spinning off threads, or any of the myriad gazillion other cares that server developers used to worry about. Instead, that’s been built into a reactor core inside Node.js, and you receive notice of events (just like in browser-based Javascript), where the events will be things like “header received,” “body received,” “message end,” and “connection made,” and to which you attach closures that respond appropriately. It’s an application server’s base language on acid, and while I’m bad at making predictions, I suspect Node.js will be with us for a while.

Already there’s a NoSQL database interface, the start of an application server, the start of a CLib-like library (like Python or Perl’s standard library), and even a comet server. This could be fun.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

So, I got tired of the way Django-SocialAuth was borked and not working for me, so I forked the project and have put up my own copy at GitHub.

There are three things I noticed about the project right away.

First, it forces you to use a broken templating scheme. I haven’t fixed that, but in the meantime I’ve ripped out all of the base.html calls to keep them from conflicting with those of other applications you may have installed. Really, the templates involved have very little meat on them, especially for the login.html page. These are components that would have been better written as templatetags.

Second, the project is rife with spelling errors. (The most famous, of course, being that the original checkout was misspelled “Djano”). I am a fan of the notion that a project with spelling problems probably has other problems. I’ll make allowances for someone for whom English is a second language, but I was not filled with confidence.

And third, the project violates Facebook’s TOS by storing the user’s first and last name. Along the way I discovered that the Facebook layer was completely non-functional if you, like three million other Facebook users, had clicked “Keep me logged in,” which zeros out the “login expires” field from Facebook. It would never accept you because your expiration date would then always be January 1, 1970, effectively before “now.”
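
To make the “Keep me logged in” failure concrete, here’s a hedged reconstruction of the bug, not the project’s literal code:

from datetime import datetime, timezone

expires = 0  # what Facebook sends when the session never expires
expiry = datetime.fromtimestamp(expires, tz=timezone.utc)  # 1970-01-01: the Unix epoch
print(expiry < datetime.now(tz=timezone.utc))  # True: every such user looks "expired"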

I’ve barely begun hacking on the beast, but already I’ve made some progress. Facebook now works the first time around. I’ve cleaned up much of the spelling and grammar in the documentation, such as it is, and I’ve clipped many of the template naming problems that I saw in my original use of the system. I’ve also revised setup.py so that it runs out of the box, although I’m tempted to go with a different plan, one like django-registration where it is your responsibility to cook up the templates for the provided views. And I’ve ripped out most of the Facebook specific stuff to replace it with calls to PyFacebook, which is now a dependency.

One thing I do want to get to is a middleware layer that interposes the right social authentication layer for those users who come in from the outside world: i.e. if the AuthMeta indicates you’re a facebook user, then request.user will be a lightweight proxy between you and Facebook for those fields that are, by definition, Facebook-only (and a violation of the TOS if you copy them). It might make more sense to have a decorator class, but only if you don’t have a gazillion views.

I haven’t gotten much further than a Facebook layer that satisfies my immediate needs. I haven’t had a need to branch out and explore the Twitter or OAuth components yet. What I needed at the moment was a simple authentication layer that allowed either local users (for testing purposes) or FacebookConnect users, and one that didn’t need to contact Facebook for absolutely every view, whether you wanted it or not, just to check “is this guy still a facebook user?”, which is how the DjangoFacebookConnect toolkit does things. I suppose, if you’re a Facebook app, that’s what you want, but I’m not writing a Facebook app; I’m writing an app that uses FacebookConnect to associate and authenticate my application users’ accounts via their Facebook accounts.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Parsing HTML with regex summons tainted souls into the realm of the living.

If you hack HTML for a living, this will make you giggle.  And given that I’ve used regex in my tests to assert the presence of classes and objects in a page, I guess I’m guilty.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
elfs: (Default)

Google last week released Google Go, a new programming language designed to be “more C-like” (a kind of python/C#/Java mash-up of a language) with support for Erlang’s excellent concurrency management, which allows you to write programs that do lots of different little things at the same time on machines with multiple CPUs. It’s odd that Google should come out with this at about the same time that I started looking into Python’s high-concurrency variant, Stackless, because both Stackless Python and Go purport to do the same thing: bring high-quality concurrency management to the average developer in a syntax more familiar than Erlang’s, which is somewhat baroque.

While looking through both, I came across Dalke’s comparison of Stackless Python versus Go. Not even taking into account the compile time (which is one of Go’s big features, according to Google), Dalke compared both and found that Stackless Python was faster. His comparison, he wrote, might not be valid, because the machines on which he ran some of the tests were close, but not the same.

So, here’s the deal: on a Thinkpad T60, running on the console with no notable load whatsoever, I ran Dalke’s two programs, both of which spin off 100,000 threads of execution and then sum them all up at the end.
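
For flavor, the Stackless side of such a benchmark looks roughly like this; a sketch of the pattern, not Dalke’s actual program, and it needs the Stackless interpreter rather than stock CPython:

import stackless

def worker(ch):
    ch.send(1)  # each microthread contributes one unit of work

ch = stackless.channel()
for _ in range(100000):
    stackless.tasklet(worker)(ch)  # create and schedule a tasklet; nothing runs yet

total = 0
for _ in range(100000):
    total += ch.receive()  # blocking here lets the scheduler run the tasklets

print(total)  # 100000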

The “real” time to run the Go program was 0.976 seconds (average over ten runs). The “real” time to run the Python program was 0.562 seconds (average over ten runs). The Python program, which must make an interpretive pass before executing, was almost twice as fast as the Go program. (It was 1.73 times as fast, which matches up approximately with Dalke’s 1.8.)

In fact, you can see the consequences of this in the way the other figures come out: the split of CPU time is wildly different for the two. Both spent approximately the same amount of time executing the user’s code (python: 0.496 seconds, go: 0.514 seconds), but the time spent by the kernel managing the code’s requests for resources is way off (python: 0.053 seconds, go: 0.446 seconds).

It may be that Go is simply doing something wrong and this can be fixed, making Go a competitive language. An alternative explanation is that Stackless Python has an erroneous implementation of lightweight concurrency and someday it’s going to be found and high-performance computing pythonistas are going to be sadly embarrassed until they fix the bug and come into compliance with Go’s slower but wiser approach.

But I wouldn’t bet on that.

Kent Beck recently said of Google Wave, “it’s a bad sign for wave that no one can yet say what it is indispensable for. not just cool (it’s definitely that), but indispensable.”  The same can be said of Google Go: there’s nothing about it I can’t live without, and other implementations of the same ideas seem to be ahead of the game.  Stackless Python not only has Erlang-esque concurrency and reasonable build and execution times, but it has a familiar syntax (which can be easily picked up by Perl, Ruby, and PHP developers), a comprehensive and well-respected standard library, and notably successful deployed applications.

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com
