Our Mindfly Blog

Website Design and Development

Showboating for the Crowd (Pointless Acid3 Scores by Opera and Webkit)

by Kyle Weems 3.April 2008 08:25

Like a slowly bubbling cauldron overseen by a benighted hag, the web developer blogosphere is beginning to boil over with putrid drama. Browser makers are drawing lines in the sand, designers are lobbing opinions out, and comment sections are filling up with bile. What could be causing such acid behavior?

Acid3. (Warning, the Acid3 test may cause certain browsers to crash while they try to jump its many hurdles.)

I've talked about it briefly before, but in short the Acid3 test is the third in a series meant to serve as a benchmark for browser compliance with standards and emerging web technologies. Like its predecessors, the purpose of the test (in theory) is to encourage browser makers to improve their browsers to the point where they pass the test, and by doing so, make a better product. Think of it as an industry litmus test.

However, it's apparently not as successful in practice as it is in theory. Both the Webkit (Safari) and Opera teams have announced 100/100 scores on the test for public builds of their browsers. Awesome, right?! The question that seems to follow is: why is Mozilla so far behind? Don't they care about these things? (You'll notice there's not a lot of surprise that IE8 isn't coming along as quickly, but such is Microsoft's lot in the world of web browsers.)

Mike Shaver spoke up on his blog a few days back discussing this very issue. He makes two major points worth consideration. The first should be obvious: Mozilla is in the endgame of development for Firefox 3, and as such needs to focus on releasing as bug-free a browser as possible. Tweaking it for Acid3 just before finishing up could introduce a number of flaws that would sour their effort. This is completely justifiable, and I think Mozilla should get their product out (I am already salivating) and then focus on improving its Acid3 score.

The second point is more drama-filled, and at the center of the firestorm. He feels that at its core Acid3 is missing the point. To quote:

"Ian’s Acid 3, unlike its predecessors, is not about establishing a baseline of useful web capabilities. It’s quite explicitly about making browser developers jump — Ian specifically sought out tests that were broken in WebKit, Opera, and Gecko, perhaps out of a twisted attempt at fairness. But the Acid tests shouldn’t be fair to browsers, they should be fair to the web; they should be based on how good the web will be as a platform if all browsers conform, not about how far any given browser has to stretch to get there"

He goes on in more detail, and I recommend reading through his post if you've got the time and inclination. Ian Hickson, one of the creators of the test, defends his position in a comment on Mike's post that can be summed up as "You're just upset because we released the test just as you're finishing your browser." I think that's a relevant issue, but I think he missed the overall point.

I'm not as knowledgeable as browser makers about how this whole browser testing thing works, so I'll do what any good blogger does: I quote smarter people.

Eric Meyer (a really smart guy) can be accused of being fairly dry when it comes to speaking, but the man gets to the heart of the issue when he talks about the recent Acid3 flareup. A key phrase of his post is:

"The real point here is that the Acid3 test isn’t a broad-spectrum standards-support test. It’s a showpiece, and something of a Potemkin village at that. Which is a shame, because what’s really needed right now is exhaustive test suites for specifications– XHTML, CSS, DOM, SVG, you name it." -- which digs to the very core of the issue.

The point of the Acid tests is not to pass them for the sake of passing them. That's about as useful as Washington's WASL tests (ask a teacher in the state about them some day and see how that goes). The entire goal is to make the web a better place by pushing browsers to improve. As Meyer also says:

"What I disagree with is the idea that if you cherry-pick enough obscure and difficult corners of a bunch of different specifications and mix them all together into a spicy meatball of difficulty, it constitutes a useful test of the specifications you cherry-picked. Because the one does not automatically follow from the other. For example, suppose I told you that WebKit had implemented just the bits of SMIL-related SVG needed to pass the test, and that in doing so they exposed a woefully incomplete SVG implementation, one that gets something like 2% pass rates on actual SMIL/SVG tests. Laughable, right? Yes, well." (emphasis mine)

How do the people involved in the Acid3 scorecard race respond? Opera's Anne van Kesteren writes in his blog:

"As for the complaining about the test. It’s certainly true that if vendors don’t take it seriously, it won’t be relevant for the Web... complaining about it now some browsers are passing seems a bit lame." -- And, to show how strong he feels his position is, he's been locking comments on all his recent posts about the subject. Good show, ol' chap.

Ultimately, this whole horserace that Opera and Webkit are involved in is a pointless publicity stunt that demeans the purpose of the Acid test and what it's meant to do for the browser community. As Ian admitted himself in his comment to Mike Shaver, "With Acid2, the original “first cut” failed a lot in IE, Mozilla, and Safari, but actually did pretty well in Opera. We (Håkon and I) then went on a hunt for Opera bugs and made Opera fare much worse on the test." (again, emphasis is mine)

Acid3 isn't going to be any different. Why go for a horserace for test compliance at the cost of introducing bugs that you will later have to fix (thereby lowering your initial fancy score)? I'd rather see all the browsers have low scores for a longer period of time if it meant that when they finally passed it was with stable, usable browsers that didn't have other flaws. Opera, Webkit, shame on you both. Stop acting like showboating high schoolers and get to work on passing the test legitimately without ignoring the gaping holes you're currently creating in the process.

Comments

Anne van Kesteren said on April 3, 2008 (11:58)...

What's there to discuss? My posts have mainly been announcements and pointers. As for our gaping holes, care to provide pointers to test cases?


Kyle said on April 3, 2008 (13:06)...

Anne,

Regarding discussion: What is there to discuss? Apparently a lot. First, as noted by Mr. Shaver and Mr. Meyer in their posts (and many others around the world), there's some serious question as to the applicability of the test itself. Secondly, there's the question of how stable the Opera build will prove to be outside of the Acid3 test. You make a statement that is a response to the criticism of Acid3, then close off the opportunity for people to respond to your response. It is, at best, a tad hypocritical, and comes across somewhat like a group of individuals patting themselves on the back while wearing earplugs. But hey, it's your blog, please feel free to operate it as you see fit.

Regarding Opera's "holes", you are correct that I don't know of any. I'm not a browser maker, nor a creator of test cases. However, given what Ian himself pointed out about the problems with Opera when it was rushed to Acid2 compliance, and the problems currently known with Webkit, it's not unreasonable to assume that something is rattling around loose inside.

Ultimately, my point was and is that Webkit and Opera's race has done little to improve your actual public products (only experimental browsers), and has shown that the test can be "cheated", as Webkit's 2% pass rate on actual SMIL/SVG tests shows. I can't see the flag waving providing any long-term benefit to the browser community whatsoever.


Anne van Kesteren said on April 3, 2008 (13:18)...

Comments: People can always comment on their blog, as you did. I'm not willing to host all discussions as it takes some time moderating and all.

Opera: What problems with Opera when it was rushed for Acid2 compliance? What Ian wrote is that he and howcome went to additional lengths to test known bugs of Opera in Acid2. I don't know of any collateral damage you seem to be speaking of.

As for improving our browsers, without Acid3 it would probably have taken far longer for us to support downloadable fonts, opacity in color values, etc. It might also have taken longer to fix several longstanding architectural bugs Acid3 uncovered. Once all browsers commit to Acid3 there will be a new baseline for Web authors to code against. Some of the tests added by the Acid3 competition are indeed not testing things thoroughly, but the only one that seems to have come up so far is SVG Animation, so I'm rather unconvinced that Acid3 is suddenly worthless given that there was only one contributed test testing that aspect.


Kyle said on April 4, 2008 (08:31)...

Anne,

All good points.

Re: Opera and Acid2 - My point was that the initial high score Opera had on Acid2 was essentially meaningless, due to the fact that correcting several bugs in the browser after reaching that score resulted in it lowering. Yes, the score was raised back to perfect after that, but it's that initial phase with a high score followed by a plummet during bug fixes that illustrates the problem I see with these early Acid3 races.

It's my impression that we'll see the same thing with these Acid3 scores. You've waved the flag with your experimental builds, but how long until Kestrel is getting a 100/100 with all the other bells and whistles? I don't see the purpose of celebrating scores (other than PR) until the actual consumer products are getting those scores.

I think that Acid3 tests very useful features that most developers are hoping to see industry-wide support for. However, due to the fact that it is essentially cherry-picking things to test, it's not as amazing a measuring stick as announced. Webkit implemented just enough SMIL support to pass the Acid3 test, yet fails actual SVG animation tests, which means that we're seeing perfect scores that sit alongside bad functionality elsewhere. Is that a single issue with the test? Yes. But if a bad implementation of the feature being tested passes, what does the test mean at that point?

Is Acid3 a good thing? Yes. But while I should moderate my initial position and grant that the Opera and Webkit 100/100 scores aren't completely pointless, they are of little value to the community until those scores are occurring in the wild, not on a test platform.


Anne van Kesteren said on April 4, 2008 (11:18)...

I don't really understand what point you're trying to make here. Browsers have bugs, there be dragons? :) Most Acid tests actually start out by creating a reference rendering in some browser and then adding lots of known bugs to it so it looks completely broken. I don't think Opera ever had an "initially" high score in Acid2 except for internal versions of the Acid2 test known only to Hixie and howcome.

As for releasing this in consumer products, of course we will, but that takes a bit more. (Firefox, for instance, has passed Acid2 for some time, but no released version will until Firefox 3.)


Powered by BlogEngine.NET 1.2.0.0. Original Design by Heather Alvis.

Bellingham, Washington
Copyright © 2007 Mindfly Inc. All Rights Reserved.