Friday, June 20, 2014

Well, let's see how less bad 31 beta 2 is for you

OK, beta 2. This is the same Mozilla code drop as beta 1 with the following changes: IME keyboard composition of characters is fixed (accented characters, Japanese, Chinese, Korean, etc.), navigating to an MP3 audio file doesn't crash the browser now, off-main-thread compositing is turned off, generational garbage collection is turned off, and I've tuned the GC timeslice a little more based on some heavy testing this afternoon while hacking up phlegmy yuck from my inflamed respiratory passages.

Of the remaining problems, performance seems a bit better overall. The G5 doesn't care about OMTC, but does benefit from having GGC off. The Sawtooth, iBook and iMac G4 systems did much better with both of them off, so I turned everything off. The stall with Adblock is still there, but is literally about half as long (i.e., I measured it with a stopwatch), so this is an improvement, and it's still one-time-only on all of my test systems.

This is, at least in my current virus-addled state, as much as I am able to do to crank up the browser until I can get better profiling support again. It gets it back to a state approximating Fx29 (minus the bugs). I am going to ship 31, probably when it reaches release -- the work is done, and staying on 24 is not a viable option. Download it and get used to it. But while you use it, you have a small question and a big question for your homework assignment:

1. It looks like Safari bookmarks import is broken again. It is possible, nay, probable, that Mozilla doesn't test against Safari 4 anymore and possibly not even Safari 5.0. There is no Mozilla bug for this but that might simply be because Mozilla doesn't care about older versions of Safari. I don't know how bookmarks change between versions but it's probably time to consider just ripping this code out since we advise people to use HTML export from Safari anyway. Opinions?

2. Is 31 where we should drop source parity, i.e., fork and add things to 31 rather than trying to keep up? It's far easier to do this at the beginning of an ESR cycle because we can just start adding things we want (HTTP/2, SPDY updates, NSS updates, root certificate updates, ECMA6/HTML5/CSS3 features) as they start rolling out while still having security support from Mozilla, rather than trying to catch up on a huge backlog when support runs out (Classilla). I had long thought what would doom us is some 10.6+ specific feature that's critical for the browser, but so far we've been able to hack around all that. Instead, what's more likely to doom us now is that our machines are just getting too slow to handle what Firefox expects to throw at them. The average age of a Power Mac is at least a decade; even the quad G5 is celebrating its eighth birthday. We are already having to cut marquee features to keep browser performance acceptable, and soon we will not be able to cut them: while we might get away with no generational GC for a while (especially since Mozilla doesn't use it on FxOS devices yet), we already know OMTC will be mandatory soon, and those are just the land mines we know about right now. If we fork and go to feature parity, at least we can keep the browser core reasonably up to date but not have to contend with these issues.

The situation I worry about is that we will struggle up to 38 and have a browser that is crushingly slow on all but the highest-spec G5 systems with no solution for laptops or other G4 computers (let alone the G3), and it will be much harder to backport the things we want by that time. 31 is already a substantial compromise between performance and future compatibility. Where should that dividing line be? And don't forget that 10.6 support will be fading away too; when 10.6 goes, Mozilla can make even more assumptions about the hardware that won't be true for us. (Right now 10.6 is still holding on to about 20% of the Mac user base, which is an incredibly stable figure, but it's not growing and it's likely to slip as hardware ages, computers are upgraded and Apple stops supporting building against the 10.6 SDK.)

While you think about that, the downloads and updated release notes are available. There will be one more scheduled beta a couple weeks prior to final release just to put any lingering issues to bed, and in July we'll also do our annual update on the state of the Power Mac userbase. Now I have to go cough up some more nasty mucus, so excuse me.

18 comments:

  1. Below are some performance tests from my PowerBook 1.3 GHz G4, processor set to "highest". Taken with fresh profiles and the latest versions (24.6.0, 29.0, 31.0b2). I hope this is as objective as can be. I'm not going to comment on the numbers, just trying to find out what happened between 29 and 31. Please don't kill the messenger :-)

    BTW dom.max_script_run_time in 31 is 10 by default (down from 40 in 24), which will cause at least one "JavaScript Fail!!!" thread per day on Tender.
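
If the shorter timeout produces too many slow-script warnings, the preference can be overridden per profile. The following is a minimal sketch, assuming TenFourFox reads a `user.js` file from the profile folder at startup the way Firefox does. The helper name `set_user_pref` and any profile path passed to it are hypothetical; the same change can be made by hand in about:config.

```python
import json
from pathlib import Path


def set_user_pref(profile_dir, name, value):
    """Add or replace one user_pref line in <profile_dir>/user.js.

    profile_dir is a placeholder for the user's real profile folder.
    """
    user_js = Path(profile_dir) / "user.js"
    prefix = f'user_pref("{name}",'
    # Keep every existing line except an earlier setting of the same pref.
    lines = user_js.read_text().splitlines() if user_js.exists() else []
    lines = [ln for ln in lines if not ln.strip().startswith(prefix)]
    # json.dumps renders ints, booleans and strings in the syntax user.js expects.
    lines.append(f'user_pref("{name}", {json.dumps(value)});')
    user_js.write_text("\n".join(lines) + "\n")
    return user_js
```

Calling `set_user_pref(profile_dir, "dom.max_script_run_time", 40)` would write the value that 24 shipped with; the browser has to be restarted before it takes effect.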

    I assume that the performance problem with Adblock in 31 can also be addressed by the Adblock developers (it has happened before).

    Hope you get better soon! Now I'm going to think about my homework.

    Replies
    1. Chris mentioned that the Adblock developers might be able to address the performance issue with TenFourFox 31. Has anyone contacted them? If not, would it be worthwhile and what info should be passed along to them?

  2. = Page loading
    (The first measurement each is uncached, the following are cached.)

    Load Facebook newsfeed page
    24: 12s, 10s, 11s, 8s, 10s
    29: 13s, 12s, 12s, 13s, 13s
    31: 24s, 24s, 23s, 23s, 24s

    Load Facebook profile page
    24: 9s, 7s, 7s, 8s, 7s
    29: 12s, 8s, 7s, 9s, 8s
    31: 18s, 15s, 14s, 17s, 14s

    Load Ebay.de start page
    24: 18s, 15s, 13s
    29: 19s, 15s, 13s
    31: 28s, 21s, 23s

    Load Ebay.de search results page
    24: 8s, 8s, 8s
    29: 9s, 9s, 9s
    31: 13s, 15s, 12s


    = Dynamic content loading

    Scroll back two days on Facebook newsfeed with page-down key pressed
    24: 62s, 73s, 66s
    29: 76s, 74s, 60s
    31: 140s, 125s, 134s
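
One way to summarize the timings above is to average each set of runs and express 31 relative to 29. This is only a sketch: the numbers are copied from the lists above, while the dictionary layout and the `slowdown` helper are made up for illustration and were not part of the original tests.

```python
from statistics import mean

# Timings in seconds, copied from the measurements above and keyed by version.
# For the page loads, the first value in each list is the uncached load.
timings = {
    "Facebook newsfeed": {24: [12, 10, 11, 8, 10], 29: [13, 12, 12, 13, 13], 31: [24, 24, 23, 23, 24]},
    "Facebook profile": {24: [9, 7, 7, 8, 7], 29: [12, 8, 7, 9, 8], 31: [18, 15, 14, 17, 14]},
    "Ebay.de start": {24: [18, 15, 13], 29: [19, 15, 13], 31: [28, 21, 23]},
    "Ebay.de search": {24: [8, 8, 8], 29: [9, 9, 9], 31: [13, 15, 12]},
    "Facebook scroll": {24: [62, 73, 66], 29: [76, 74, 60], 31: [140, 125, 134]},
}


def slowdown(runs, old=29, new=31):
    """Mean time in `new` divided by mean time in `old` (1.0 means unchanged)."""
    return mean(runs[new]) / mean(runs[old])


for name, runs in timings.items():
    print(f"{name}: 31 takes {slowdown(runs):.2f}x as long as 29")
```

With these figures the ratio ranges from roughly 1.5x (Ebay.de search) to 1.9x (Facebook scroll). The sample sizes are small, so the ratios are indicative only.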


    = JavaScript

    Sunspider 1.0.2 (I ran these several times, the results were consistent)

    ==31==
    Total: 6800.5ms +/- 1.1%

    3d: 878.4ms +/- 2.9%
    cube: 381.6ms +/- 6.7%
    morph: 212.7ms +/- 3.9%
    raytrace: 284.1ms +/- 7.9%

    access: 850.0ms +/- 3.3%
    binary-trees: 271.0ms +/- 4.0%
    fannkuch: 266.1ms +/- 6.7%
    nbody: 196.8ms +/- 3.1%
    nsieve: 116.1ms +/- 2.5%

    bitops: 525.6ms +/- 5.2%
    3bit-bits-in-byte: 69.3ms +/- 8.0%
    bits-in-byte: 92.3ms +/- 4.1%
    bitwise-and: 211.2ms +/- 11.3%
    nsieve-bits: 152.8ms +/- 2.2%

    controlflow: 70.4ms +/- 5.7%
    recursive: 70.4ms +/- 5.7%

    crypto: 503.2ms +/- 5.7%
    aes: 236.4ms +/- 7.5%
    md5: 150.1ms +/- 4.1%
    sha1: 116.7ms +/- 15.1%

    date: 942.3ms +/- 3.2%
    format-tofte: 578.9ms +/- 3.8%
    format-xparb: 363.4ms +/- 5.5%

    math: 783.5ms +/- 2.2%
    cordic: 487.6ms +/- 4.0%
    partial-sums: 189.4ms +/- 5.5%
    spectral-norm: 106.5ms +/- 7.0%

    regexp: 107.5ms +/- 3.4%
    dna: 107.5ms +/- 3.4%

    string: 2139.6ms +/- 2.6%
    base64: 192.9ms +/- 8.1%
    fasta: 451.8ms +/- 1.8%
    tagcloud: 656.3ms +/- 1.7%
    unpack-code: 553.4ms +/- 7.9%
    validate-input: 285.2ms +/- 5.7%

    ==29==
    Total: 3951.8ms +/- 1.6%

    3d: 661.4ms +/- 3.3%
    cube: 269.4ms +/- 3.5%
    morph: 181.9ms +/- 9.4%
    raytrace: 210.1ms +/- 4.7%

    access: 676.6ms +/- 4.5%
    binary-trees: 109.0ms +/- 18.1%
    fannkuch: 264.5ms +/- 2.4%
    nbody: 184.9ms +/- 11.5%
    nsieve: 118.2ms +/- 6.7%

    bitops: 524.7ms +/- 3.9%
    3bit-bits-in-byte: 67.9ms +/- 2.7%
    bits-in-byte: 94.6ms +/- 7.5%
    bitwise-and: 209.8ms +/- 7.9%
    nsieve-bits: 152.4ms +/- 2.8%

    controlflow: 68.0ms +/- 5.4%
    recursive: 68.0ms +/- 5.4%

    crypto: 328.6ms +/- 6.7%
    aes: 155.1ms +/- 3.3%
    md5: 91.6ms +/- 14.5%
    sha1: 81.9ms +/- 24.2%

    date: 322.8ms +/- 10.5%
    format-tofte: 190.2ms +/- 3.5%
    format-xparb: 132.6ms +/- 22.2%

    math: 506.1ms +/- 2.8%
    cordic: 216.0ms +/- 3.4%
    partial-sums: 178.5ms +/- 8.2%
    spectral-norm: 111.6ms +/- 10.3%

    regexp: 92.8ms +/- 7.0%
    dna: 92.8ms +/- 7.0%

    string: 770.8ms +/- 3.0%
    base64: 98.4ms +/- 7.7%
    fasta: 188.1ms +/- 10.4%
    tagcloud: 196.6ms +/- 5.1%
    unpack-code: 166.6ms +/- 6.5%
    validate-input: 121.1ms +/- 5.3%

    ==24==
    Total: 4009.2ms +/- 8.0%

    3d: 617.2ms +/- 4.1%
    cube: 237.8ms +/- 7.1%
    morph: 168.5ms +/- 5.5%
    raytrace: 210.9ms +/- 7.8%

    access: 669.0ms +/- 3.9%
    binary-trees: 109.9ms +/- 17.7%
    fannkuch: 274.1ms +/- 2.9%
    nbody: 170.9ms +/- 3.3%
    nsieve: 114.1ms +/- 14.7%

    bitops: 526.1ms +/- 7.0%
    3bit-bits-in-byte: 67.9ms +/- 5.3%
    bits-in-byte: 98.2ms +/- 8.3%
    bitwise-and: 197.6ms +/- 4.5%
    nsieve-bits: 162.4ms +/- 18.4%

    controlflow: 68.7ms +/- 12.6%
    recursive: 68.7ms +/- 12.6%

    crypto: 370.6ms +/- 21.1%
    aes: 201.0ms +/- 17.6%
    md5: 89.2ms +/- 23.4%
    sha1: 80.4ms +/- 31.5%

    date: 372.2ms +/- 31.4%
    format-tofte: 220.3ms +/- 33.0%
    format-xparb: 151.9ms +/- 29.5%

    math: 527.7ms +/- 18.8%
    cordic: 244.8ms +/- 28.5%
    partial-sums: 188.6ms +/- 18.7%
    spectral-norm: 94.3ms +/- 4.3%

    regexp: 86.0ms +/- 3.9%
    dna: 86.0ms +/- 3.9%

    string: 771.7ms +/- 4.1%
    base64: 99.4ms +/- 10.6%
    fasta: 252.8ms +/- 7.5%
    tagcloud: 178.2ms +/- 5.3%
    unpack-code: 143.1ms +/- 5.0%
    validate-input: 98.2ms +/- 10.6%
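
To see where the regression is concentrated, the suite totals can be compared directly. The sketch below uses only the per-suite totals copied from the three result listings above; the `ratios` helper and the table layout are illustrative assumptions, not something Sunspider itself provides.

```python
# Suite totals in milliseconds, copied from the Sunspider 1.0.2 results above.
suites = {
    "3d": {24: 617.2, 29: 661.4, 31: 878.4},
    "access": {24: 669.0, 29: 676.6, 31: 850.0},
    "bitops": {24: 526.1, 29: 524.7, 31: 525.6},
    "controlflow": {24: 68.7, 29: 68.0, 31: 70.4},
    "crypto": {24: 370.6, 29: 328.6, 31: 503.2},
    "date": {24: 372.2, 29: 322.8, 31: 942.3},
    "math": {24: 527.7, 29: 506.1, 31: 783.5},
    "regexp": {24: 86.0, 29: 92.8, 31: 107.5},
    "string": {24: 771.7, 29: 770.8, 31: 2139.6},
}


def ratios(old=29, new=31):
    """Return (suite, new/old time ratio) pairs, largest slowdown first."""
    pairs = ((name, t[new] / t[old]) for name, t in suites.items())
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for name, ratio in ratios():
    print(f"{name:12s} {ratio:.2f}x")
```

By this measure, date and string stand out at nearly three times their 29 timings, while bitops and controlflow are essentially unchanged.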

  3. I'm not going to do anything specific with the page load times. I can't really figure out how much of that is the browser.

    I will look at the JS regression. It may take some time to analyze, but I appreciate the numbers. However, if it requires an extensive change or backout, that means that we will drop source parity for sure. I don't have another large merge in me and this browser consumes too much of my life already.

    Replies
    1. Interestingly, I feel quite similarly with respect to WebKit - but for a different reason.
      WebKit would be easy to maintain for Leopard if only there were a decent Objective-C++ compiler with C++11 support available for PowerPC OS X - but there isn't yet, and I have already spent some hundreds of hours working around that fact.

      Furthermore, the WebKit 1 API is going to be released as "WebKit Legacy API" in the next OS X release (they really call it 10.10 in the source code). So it'll be deprecated in 10.10 - and WebKit 538 will be the last release of Leopard WebKit. On the other hand, knowing that fact gives me plenty of time to work through those thousands of change sets that are still being committed each month (in fact there are now more commits than at the time when Google and others were still contributing to WebKit), fixing all sorts of things and sometimes enhancing functionality.

      So the end of the road for Leopard WebKit is nearly in sight (it will be when they switch to 539 in trunk and 538 gets its own branch). But work on it will not stop, as there are still the challenges of getting the CSS selector JIT and at least the baseline JavaScript JIT working.

  4. URL box still not resizing as addon icons are moved to it.

    Replies
    1. Works for me. Are you sure you're not using an add-on that interferes? N.B. you'll get an overflow indicator if you have too many items. http://s3.postimg.org/umczk8cpv/Picture_4.png

    2. All I get is the overflow indicator with nothing else. I'll investigate further. Works in 29 though just as you pictured.

    3. Doesn't work in safe mode with default theme. I have bookmark folders within other folders as well as single bookmarks. It works on Firefox 31. Any suggestions?

    4. Do you use the same window sizes and the same number of items in FF 31 and TFF 31? What happens when you maximize the TFF window?

    5. I deleted two of the extra folders & it displays correctly. It looks like if you have too many bookmarks it won't display ANY & just gives you the overflow indicator. It won't display some & give you the overflow indicator as it should. I've only tried it with folders but I bet it happens with just bookmark icons also. Your picture shows no overflow indicator; add some & see what happens.

    6. In short, 29 displays bookmarks AND the overflow indicator when needed. 31 displays either bookmarks OR the overflow indicator, but not both.

    7. The dynamic resizing of the URL box is off kilter in 31. If you use roomy bookmarks & display the bookmark name on mouseover, the URL box will not dynamically resize & snaps back to full size, showing the overflow icon.

  5. I suspect (or: hope) that the page load times and the JS regression are connected. Websites for which I disabled JS have about the same load times as in 29.

    Now, homework:

    1) The Safari data importer can be taken out, as have been the importers for Opera and Netscape. Importing bookmarks via html is a method much more likely to work and keep working for any browser, so that's sufficient for me.

    2) Dropping source parity will give Cameron more time to work on features (I'm still hoping for H.264 support…) instead of doing work to get the browser to compile at all. So this will be like coming home from an exciting trip on the stormy rapid release cycle sea into a safe harbor and enjoying the sunset from the Captain's seat with a beer and barbecue. A bit melancholic, but relaxed.

    If 31's JavaScript can be repaired to run at 29's speed again without too much surgery, then – from the performance perspective – I see no reason why it should be 31 specifically where source parity is dropped. However, from a security/ESR maintenance perspective, parting ways at 31 makes perfect sense.

    If JS in 31 can't be repaired, I'd almost recommend using 29 as the future codebase. The thing is, people will go back to 24 anyway if they see 31's performance on their G3 or 7400 G4, because they are likely to choose usability over security.

  6. While keeping source parity is nice, the overall stability and performance of the browser are more important. I only hope this won't compromise support for add-ons. In the same way, I think that dropping bookmarks import from Safari would be a good move if that means simplifying code maintenance. As long as importing from HTML works, it won't be a huge loss.

  7. Some figures from a masochistic G3 iMac owner (1GB RAM, original 40GB drive):

    TFF 24:

    Time to window opening: 41 seconds, 13 seconds when cached
    Time to TFF default window fully drawn: 47 secs, 16 secs cached

    AdBlock makes little difference

    31b2:

    Time to window opening: 54 seconds, 23 seconds when cached
    Time to TFF default window fully drawn: 69 secs, 35 secs cached

    31b2 + AdBlock:

    Time to window opening: 36 secs when cached
    Time to TFF default window fully drawn: 49 secs cached
    Then after about 70 seconds I get a consistent 90 second beachball.

    If I was a selfish G3 owner, I would say drop source parity if it's going to drag performance down. But to be honest, as much as it pains me to type this, I'm wondering whether you should be considering G4s as your minimum target platform. I didn't realize just how slow this G3 is getting until I tried loading the Twitter site and my online banking site on it. It's slow on 24 and painful on 31 and TBH I have alternatives (sadly, Windows....my G4 and Intel Macs all died far too soon for me to risk wasting money on more new Apple hardware).

    And sadly this 12-year-old iMac is starting to show its age too....it takes about 10 minutes for the CRT to focus from cold boot, so I wonder how much longer it'll keep going, after which I'll have an even slower B&W G3 to run my old PowerPC software.

    Replies
    1. This is a very interesting comment. I'm not sure what's more important for TFF – the processor (Altivec) or the clock frequency. I have one of the "high-end" G3s speed-wise (800 MHz iBook) and it runs TFF 24 very nicely.

      I did some tests similar to the ones I did above: iBook G3 800 MHz vs. PowerBook G4 "667" MHz (set to reduced processor speed). TFF 31.0.b2 with fresh Firefox profiles.

      - Startup time for the browser is about the same.
      - JS seems to be pure clock-speed; Sunspider is 8765 on the 800 MHz G3 vs. 12301 on the "667" MHz G4.
      - Page loading speed is about 20% slower on the G3, so the clockspeed helps, but the slower-everything-else on the little iBook (like RAM) drags it down again a bit.
      All in all, not too bad for the G3.

      Regarding the WebM regression between 24 and 31: I see this also on the G3. In 24 I get about 2 frames per second, and in 31 I get 1 frame every 2 seconds. So this is not processor-specific or Altivec-related.

    2. Mmm...maybe there's still life in the old G3 yet...although maybe not my iMac.

      I wish my 800 MHz iBook was still working, so I could compare it with the 600 MHz iMac. Sadly it seems to be suffering from the infamous iBook motherboard graphics chip issue.

