Issues with Active Scaffold and Rails 2.1, Solved

June 28th, 2008

If you are looking to upgrade to Rails 2.1 and you are using Active Scaffold, be aware it still has a few rough edges.

Make sure you install the Rails 2.1 branch of Active Scaffold:


git clone git://github.com/activescaffold/active_scaffold
cd active_scaffold
git branch -r    # just lists the remote branches
git checkout origin/rails-2.1

(Check out the full thread for Rails 2.1 and Active Scaffold compatibility)

Second, I had to patch vendor/plugins/active_scaffold/lib/extensions/generic_view_path.rb. On line 53, make it look like this:


if !@template.controller.is_a?(ActionMailer::Base) && @template.controller.class.uses_active_scaffold?

I had to add !@template.controller.is_a?(ActionMailer::Base) &&

All my tests are now passing!

I still love Active Scaffold, even though it’s a bit behind the times.

Scalable Counters for Web Applications

June 28th, 2008

So you need to provide a count or counter for your web application, but you want it to scale. The naive approach would be to simply select count(*) from table. That will fail under load because it requires scanning your entire collection.

The first question you need to ask is, Do you need exact counts or will approximate counts be good enough? I bet in many situations, an approximate count will be perfectly reasonable. Think about the use case of tracking web hits. When you’re talking about millions of hits, what is the difference between 1,000,000 and 1,000,001? Of course, only your business expert will know if approximate or exact answers are required. The decision, though, is crucial because it’s the difference between an easy implementation and a hard (costly) implementation.

Let’s say, for the purposes of this article, that you’ll need very close to accurate counts, plus you need to scale a lot. The first step is to pre-calculate the count, and cache the results. When a new web hit occurs, grab the current count, add one, and put it back. This approach will scale for a while, but the chance of missing a count goes up as load goes up. Because we’re not explicitly locking on the row (which can be expensive), the last person to write the record back to the database wins.
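To make the missed count concrete, here is a minimal Python sketch of the unlocked "grab, add one, put back" cycle. The in-memory `store` dict is only a stand-in for the database row, and the two requests are interleaved by hand, so treat it as an illustration rather than a real workload.

```python
# A dict stands in for the database row holding the cached count (hypothetical).
store = {"hits": 10}

# Two concurrent requests each read the current value before either writes.
seen_by_request_a = store["hits"]
seen_by_request_b = store["hits"]

# Each request adds one to what it saw and writes the result back.
store["hits"] = seen_by_request_a + 1
store["hits"] = seen_by_request_b + 1  # last writer wins

# Two hits arrived, but the counter only moved from 10 to 11.
lost_updates = (10 + 2) - store["hits"]
print(store["hits"], lost_updates)
```

The more requests overlap like this, the more hits silently disappear.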

The next option is to wrap the “grab record, increment, put record back” inside a locking transaction. This will ensure that only one writer can access the counter at a time. This ensures an accurate count, but will greatly slow down the site as contention around the single counter increases.
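As a rough sketch of this second option, the Python below guards a single in-memory counter with one lock. The `LockedCounter` class and the thread counts are assumptions made for illustration; a real site would rely on the database's own row locking or transactions.

```python
import threading


class LockedCounter:
    """A single counter guarded by one lock (hypothetical in-memory stand-in)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one writer at a time may read, add one, and write back.
        with self._lock:
            self._value += 1

    def value(self):
        with self._lock:
            return self._value


def record_hits(counter, hits):
    for _ in range(hits):
        counter.increment()


counter = LockedCounter()
threads = [
    threading.Thread(target=record_hits, args=(counter, 1000)) for _ in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total = counter.value()
print(total)
```

Every hit is counted, but all eight writers queue up behind the same lock, which is exactly the contention described above.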

The third option, and the best option, is to split the counter up into smaller counters. When it’s time to get the full, single count, simply grab all the counter partitions and add them up. For very high loads, increase the number of partitions. The theory is it’s quick to add up 100 partitions, while you’re providing 100 different counters to lock around.

How do you pick which partition to increment? One easy way is to hash the timestamp of the request (or some other part of the request that changes frequently) and take the result modulo the number of partitions in the system. The theory here is that you’ll be spreading the load across the partitions as the number of concurrent requests increases.
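Here is a minimal Python sketch of a partitioned counter along these lines. It is an in-memory illustration only: the `PartitionedCounter` class, the choice of SHA-256 as the hash, and the partition count of 20 are all assumptions, and a real deployment would keep each partition as its own record in the data store.

```python
import hashlib
import threading
import time

NUM_PARTITIONS = 20  # assumed value; raise it as write load grows


class PartitionedCounter:
    """One logical counter split across several independently locked partitions."""

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self._counts = [0] * num_partitions
        self._locks = [threading.Lock() for _ in range(num_partitions)]

    def _pick_partition(self, request_key):
        # Hash something that varies per request, then take it modulo
        # the number of partitions.
        digest = hashlib.sha256(request_key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self._counts)

    def increment(self, request_key):
        index = self._pick_partition(request_key)
        # Writers only contend with others that landed on the same partition.
        with self._locks[index]:
            self._counts[index] += 1

    def total(self):
        # Reading the full count means summing every partition.
        total = 0
        for index, lock in enumerate(self._locks):
            with lock:
                total += self._counts[index]
        return total


counter = PartitionedCounter()
for request_number in range(1000):
    # A timestamp plus a sequence number stands in for the request here.
    counter.increment(f"{time.time_ns()}-{request_number}")

total_hits = counter.total()
print(total_hits)
```

Reads get slightly more expensive because they touch every partition, but writes no longer queue behind a single lock.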

In any scalable web system, reads should be by key and writes are expensive. Do whatever you can to read a single object by a key, and minimize your writes. Minimize the contention around objects in the data store, too. Realize that ad hoc queries can almost always be implemented by pre-calculating the answers, so that an ad hoc query is simply retrieving a record by a key (instead of scanning through all rows, computing the answer as you go).

For more on this technique, I recommend the excellent video Building Scalable Web Applications with Google App Engine.

Announcing Whatever Is Fine With Me, The Easiest Friend Polling Site

June 27th, 2008

World, meet Whatever Is Fine With Me. I think you two will really hit it off.

Whatever Is Fine With Me is an easy, quick polling or voting application that you can use to find an answer to a question faster than you can sing the Canadian national anthem.

The site was created after we found it difficult to make ad hoc decisions among our group of friends. Most famous is “where are we going to lunch?” Of course, the most common answer to that is: “whatever is fine with me.”

I wanted to create a site that is task focused, fast to use, and with as close to a zero barrier to entry as possible. I am sick of creating accounts on sites, so my first rule was to banish account signups. It’s still fairly secure, as each page is identified by a very hard to guess URI.

Why use Whatever Is Fine With Me?

  • No login required for you or your friends
  • Extremely fast and easy
  • Totally and completely free

I’d love to hear feedback; please post any comments to the Whatever Is Fine With Me Forums. If enough people begin to use it, I’d be happy to add features.

Some new features I was thinking about or have been suggested:

  • Make “whatever is fine with me” an explicit option when choosing your votes
  • Remember your friend list, so you don’t have to type it in multiple times
  • Remember all your past questions

So please give it a shot and I hope it’s useful!

(I’m posting this to see if anyone wants to use it. I’m really hoping so, so I can justify spending time on it.)

And while Whatever Is Fine With Me is perfect for asking your friends questions, if you want to ask the world a question, you definitely want to check out Ask 500.

Proposed Enhancements For Web Browsers

June 10th, 2008

I was listening to Muxtape (quite possibly the best user interface for a web application). I use Firefox and I have lots of tabs open. Muxtape is playing in the background, in another tab. Sometimes a song comes up on Muxtape that I don’t like and wish I could skip. However, I’m in “the flow” and don’t want to leave my current tab or application.

I’d like to propose an enhancement for web browsers to simplify the interaction with tabs in the background. Web applications should be able to specify a small menu of commands which can be executed from the tab without having to pull that tab into the foreground.

I’d love to be able to right-click on the tab and see options such as “Skip”, “Pause”, “Back”, or “Repeat”. This context-sensitive menu is specific to each tab, and clicking on any of the options would call a JavaScript function.

I envision this as easily specified as part of the larger effort of HTML 5 to address modern day requirements of web applications to offer a richer experience.

Google’s App Engine To Force Me To Learn Django

April 7th, 2008

Google App Engine was just announced. Scalable, integrated applications hosted on Google’s cloud. I’m particularly interested in the Datastore. Take that, Amazon SimpleDB: Datastore supports ordering result sets!

After a tour through the examples, it’s safe to say that App Engine is not a direct competitor to Amazon’s general purpose cloud offerings, such as EC2 and S3. App Engine is targeted directly at request/response web applications. Amazon’s offerings are much more generic, allowing you to run whatever you want on EC2.

App Engine’s data store is closer to a traditional database than Amazon’s SimpleDB. With App Engine, you have a query language that looks and smells familiar. SimpleDB is a distributed hash. Both technologies are useful, but if you’re comfortable with SQL, you’ll find App Engine’s Datastore more friendly.

Also interesting is that App Engine has direct support for Google Accounts. Don’t want to write Yet Another Account System? With just a few lines of code, you have deep support for Google Accounts. Note that nothing prevents you from writing your own account management features. Too bad App Engine didn’t have direct support for OpenID.

What’s kind of cool is that you can lock down your application to users of a particular Google Apps domain. This might be useful if you wish to write an application for your company or organization. Nice touch.

Now, if I can only get my beta account approved! Update: my account is approved! I can’t wait to test the scalability of the Datastore.

When you register an application with App Engine, you have the option of binding your own domain (e.g. example.com). You do this through Google Apps. If you don’t have your domain yet, it’s trivial to purchase the domain name through Google Apps (via GoDaddy). This is very handy because all the work is done for you behind the scenes. I registered a new domain name, bound it to Google Apps, and then to my App Engine application within 10 minutes.

Time from downloading the SDK, through domain registration and application upload, to a running application: 1 hour. That was amazing. Of course, it doesn’t do anything yet. But that’s pretty cool.

Nest Those Rails Resources Or Make Baby Semantic Web Cry

April 2nd, 2008

Proper web architecture dictates that you should “Assign distinct URIs to distinct resources.” And Cool URIs for the Semantic Web states that:

There should be no confusion between identifiers for Web documents and identifiers for other resources. URIs are meant to identify only one of them, so one URI can’t stand for both a Web document and a real-world object.

So we know that a URI should refer to one and only one resource. (Of course, you may have many URIs all referring to the same resource.) So why do so many web sites have URIs like http://www.example.com/myaccount? That same URI is used to refer to any account in the system, depending on who is logged in. And that makes Baby Semantic Web cry.

Why is the baby sobbing? A generic URI like http://www.example.com/myaccount isn’t useful on the semantic web, because it’s very difficult to make meaningful statements about that URI. Let’s go and try.


http://www.example.com/myaccount is the account page of "Seth Ladd".

and


http://www.example.com/myaccount is the account page of "Bob Smith".

Hmm… so http://www.example.com/myaccount is the account page for both Seth and Bob? That doesn’t make much sense!

A better URI for an account page would be http://www.example.com/accounts/23232, which is easily unique for every user.

The moral of this story is that every one of your URIs should be unique. So let’s bring this all the way back to Rails and resources.

When building your resources, ask yourself, “If I GET this URI, will I see the same thing no matter who is logged in?” If the answer is “No” then you need to nest your resources so that the URI is unique and the same representation is returned no matter who you are logged in as.

For example, a typical URI would be http://www.example.com/books, which could easily be a collection of books for the user. The contents of that URI are relative to the person logged in, so we have a problem. To fully qualify the URI, we need to nest books inside of the user collection. We end up with http://www.example.com/users/1/books, which is unique and follows web architecture best practices. Now we can make unambiguous statements about the URI, thus populating the semantic web with more useful and meaningful triples.
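The difference between the generic and the nested URIs can be sketched in a few lines of Python. This is not Rails routing code; the helper names, the `accountPageOf` predicate, and the second account ID are made up for illustration. It simply records statements as (subject, predicate, object) triples and flags any URI that ends up describing more than one thing.

```python
BASE = "http://www.example.com"


def account_uri(account_id):
    return f"{BASE}/accounts/{account_id}"


def books_uri(user_id):
    # Nest the collection under its owner so the URI names exactly one resource.
    return f"{BASE}/users/{user_id}/books"


def ambiguous_subjects(triples, predicate):
    """Return subjects that have more than one distinct object for predicate."""
    objects_by_subject = {}
    for subject, pred, obj in triples:
        if pred == predicate:
            objects_by_subject.setdefault(subject, set()).add(obj)
    return sorted(s for s, objs in objects_by_subject.items() if len(objs) > 1)


# The generic URI ends up describing two different people.
generic = [
    (f"{BASE}/myaccount", "accountPageOf", "Seth Ladd"),
    (f"{BASE}/myaccount", "accountPageOf", "Bob Smith"),
]

# Distinct URIs keep each statement unambiguous (the IDs are made up).
distinct = [
    (account_uri(23232), "accountPageOf", "Seth Ladd"),
    (account_uri(23233), "accountPageOf", "Bob Smith"),
]

print(ambiguous_subjects(generic, "accountPageOf"))
print(ambiguous_subjects(distinct, "accountPageOf"))
print(books_uri(1))
```

Only the generic `/myaccount` URI is reported as ambiguous; the per-account and nested URIs each carry a single, stable meaning.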

The Semantic Web in Action Article From Scientific American Online and Free

March 31st, 2008

The Semantic Web in Action, originally published in the December 2007 issue of Scientific American, is now online and free. The original article published in the May 2001 issue of Scientific American was certainly due for an update.

The original article made a lot of grand promises, while the December 2007 article details current efforts at applying semantic web technologies to real life problems. Check it out if you’re interested in how companies are building the semantic web today.

What If We Ran an Iron Coder?

March 25th, 2008

I’ve been a fan of Iron Chef America for a while. Fast paced and some very interesting dishes, it’s entertaining and even a bit educational (for the epicurean viewer).

Being a geek at heart, it leaves me wondering what it would take to create an Iron Coder competition. With the right “ingredients” it just might work.

First, we’d need a play-by-play announcer and a color commentator. On Iron Chef America, this single role is played by Alton Brown. We might be able to get away with a single person, but I often like the banter of two announcers. It is, of course, their job to explain what is going on and provide insight and entertainment during the battle.

There is, of course, the Secret Requirement. This brings us to the question of what type of code the two Iron Coders are creating. I come from a web application background, so this is my first assumption. You can’t pit an X-Box programmer against a Perl script kiddie. For now, let’s stick with building simple web applications. I don’t think we should restrict it to a particular technology. In fact, part of the fun would be to watch a Rails expert go head to head against a .NET expert.

As for the Secret Requirement, this can go one of two ways. Option A would be to mandate a large scope for the application. For instance, “Build a Time Card application!!!” The particulars are left up to the Iron Coders. Perhaps there is a very small set of requirements handed down, like “user clocks in” and “user clocks out” and “manager pulls report of this week’s time”.

Option B for Secret Requirement would mandate a very small requirement, such as “must use the visitor pattern and two factory pattern implementations!!!” This would let the Iron Coders build whatever they like, as long as they use the Secret Requirement. This more closely matches the original Iron Chef in intent, but how easy is it to create these small requirements? Do they provide enough constraint for the Iron Coders?

I’m going to lean towards Option A, specifying a broad, yet simple, application domain. Leave the particulars up to the Iron Coders.

Next up we have Judging. This is where it gets interesting, as we need to decide how to choose a winner. In Iron Chef America, they judge the dishes with a point scale across three categories: Originality, Taste, Presentation. For Iron Coder, I’d propose the categories to be Originality, Accuracy, and Construction. Let me explain:

Originality would be the judge’s take on how originally the Iron Coder implemented the Secret Requirement. The more interesting, unique, and surprising the Coder’s web application, the more points here.

Accuracy is measuring the correctness of the application. This one is tough because there may not be that many formal specs, but given that Accuracy is a judging category, we might need them. In any case, Accuracy measures if the application functions properly. If any bugs or inconsistencies are encountered, points are lost here.

And finally Construction, which is measuring the quality and beauty of the code itself. This is a lot like porn: you know it when you see it. Is the code DRY? Does it use patterns appropriately? Does it follow good OO design? Is it a hack, or is it beautiful? If anything, this category is too broad. In any case, it’s very important and must be judged.

Logistics is something I worry about. Watching people write code isn’t exactly exciting, but I think this event should be live. I’d like to emulate a live studio audience, where viewers can chat along with the action. However, I don’t think the Iron Coders should be able to watch the chat (not sure how you’d accomplish that one). This is my biggest unknown. What’s a good screen sharing program? How do we deal with small font sizes? Are there any editors that provide a real-time link (SubEthaEdit comes to mind for collaborative editing)?

Finally, it has to be campy. This should be fun for Iron Coders and viewers alike. I think if we can figure out the logistics issues of actually broadcasting text editors, it could be a fun event.

REST, Hypermedia, and JSON

March 23rd, 2008

Enjoying Sam Ruby’s latest post titled Connecting, I was reminded that designing a RESTful system means much more than just adhering to a uniform interface (often the first attribute of a ROA that is promoted).

When designing a Resource Oriented Architecture, you mustn’t forget about hypermedia and how your resources link to one another, and how those links are expressed through your representations.

Which brings me to JSON. A nifty little format, indeed. And one that, if you are building a modern web service, you should be investigating and implementing. However, if you are indeed building a hypermedia system with JSON, how do you express your links?

I know how to do this in XHTML and RDF, but I’m not sure how to express or render a URI in JSON and have it mean “link”.

I’d love to be able to do something like this:


{
  "name": "Cool Beans",
  "account": "http://example.org/accounts/23242342"
}

However, any JSON client will have a tough time determining what the meaning of the account value is. Sure, it’s a string. But how should it be treated? And how do I express that to my JSON clients? I’m not about to give them a regular expression and say “if it matches, it’s a URI, and follow it!”
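One possible convention, sketched below in Python, is to move URIs out of ordinary string fields and into an explicit `links` array of `rel`/`href` pairs, loosely modelled on the HTML link element. The `links`, `rel`, and `href` names are assumptions of this sketch, not an established JSON standard, and client and server would have to agree on them out of band.

```python
import json

# The same record as above, with the account URI marked explicitly as a link.
document = json.loads("""
{
  "name": "Cool Beans",
  "links": [
    {"rel": "account", "href": "http://example.org/accounts/23242342"}
  ]
}
""")


def find_link(representation, rel):
    """Return the href of the first link with the given rel, or None."""
    for link in representation.get("links", []):
        if link.get("rel") == rel:
            return link.get("href")
    return None


account_href = find_link(document, "account")
print(account_href)
```

A client written against this convention never has to guess whether a string is a URI: anything inside `links` is meant to be followed, and everything else is plain data.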

Thoughts? How to express hypermedia in JSON?

Why Flickr Doesn’t Do FOAF

March 23rd, 2008

Tim Berners-Lee asks “So do you think Flickr could be persuaded to source FOAF?”

Given what I’ve heard from Stewart Butterfield (co-founder of Flickr), the answer is a No.

Back in 2004 (Mon, Nov 29, 2004 at 8:41 PM to be exact), I wrote Flickr asking if they could add sha1 hashes of user emails (in an obvious attempt to be able to convert the data into FOAF). Here’s the original request email:

Hello,

Would it be possible to add a sha1 hash of a person’s email address to
the response of flickr.people.getInfo ? I understand that we don’t
want to give out email addresses, and it’s nice that the API doesn’t
expose them. But to help in uniquely identifying users across
systems, a good identifier is often their email address. To safe
guard against spam, creating a SHA1 hash is a good way to hide the
email, yet still provide a unique identifier for the user.

This sha1’ed email address becomes a candidate key to the user, so to speak.

Thoughts?

Thanks!
Seth

To which Stewart replied (and I have his permission to quote him):

Seth, I guarantee that the problem is not that we don’t know how to
provide the functionalty - as you say, it’s easy.

It’s more that it has a lot of complications at the social level. How
do you know whether any of our users *wants* their Flickr profile
(potentially filled with cool, beautiful or emotionally important
family photos) to be associated with their Tribe profile (potentially
filled with descriptions of their kinky fetishes)? I know I don’t want
my professional profile on LinkedIn tied to my clownish profile on
Orkut.

Remember http://beta.plink.org/ ? … read about why it shut down. A
lot of those lessons apply to us. I think Dan Brickley is a super guy,
and I think FOAF is well intentioned. But I also think it has nothing
to do with Flickr (or even Tribe/Orkut/Friendster/whatever).

Last, since approximately 0% of users want or care about this
functionality, it’s not a good deal for us to implement it. It’d be
really neat if there were a machine-readable description of who I am
and what I’m up to online tied to a single idetifier, enabling
software that could make all kinds of inferences about me and tie all
kinds of facts about me together. On the other hand, that would really
suck. If you know what I mean.

We don’t even want to get into explaining to people what this is, let
alone build a UI to allow them to opt out, etc., etc.

I appreciate your enthusiasm, and I know you’re coming from the right
place, but it’s just not something we’re willing to support right now.
(And you can quote me if you’d like ;)

- Stewart

So, at least back in 2004, Flickr was concerned about making it too easy to “connect the dots”. I wonder if this still holds true today? Is anyone else worried about this?

I can certainly see Stewart’s point. But I bet that with some solid privacy controls, such as making the feature opt-in rather than the “opt out” Stewart describes, a middle ground could be found. Like it or not, sooner or later there will be systems to tie it all together anyway. Might as well preempt it all and put the power into the hands of the users.

UPDATE: Looks like Flickr now exports mbox_sha1sum checksums from their flickr.people.getInfo API call. Someone saw the light. :)
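For reference, an mbox_sha1sum value can be computed with a few lines of Python. In FOAF the checksum is taken over the full mailto: URI, not the bare address that my original email suggested hashing. The address below is made up, and the helper name is only illustrative.

```python
import hashlib


def mbox_sha1sum(email):
    """SHA1 hex digest of the mailto: URI for the given address."""
    mailbox_uri = "mailto:" + email.strip()
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


# A made-up address, used only for illustration.
checksum = mbox_sha1sum("seth@example.org")
print(checksum)
```

Two systems that both publish this value for the same address can match the user up without either one revealing the address itself.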