Quantum Render, Road to Kona - October 31 through December 2
This is a list of things we will get done before the Hawai'i All Hands.
Development Process:
Quantum render "baseline"
- Integrate webrender into libgkrust (https://github.com/jrmuizel/gecko-dev/pull/21)
- Have a working setup where from m-c one can build and run something that has firefox using webrender, that we can build on top of
- Have a way to run some sort of tests on this setup (that actually exercise the WR code)
- Get reftests working
- Get mochitests working - Morris, ethan
- Create Dwrite font backend - vlad
- Windows works so people can develop on it - This should be done mchang
- Linux works (still some visual issues) <-- does this mean that we should fix the visual issues? or that linux is working sufficiently that we meet the "baseline" already? or something else?
WebRender Features:
- Transformed primitives and clipping regions with masks - Dzmitry
- Sub-pixel AA - Glenn
Split layer scrolling - mrobinson
- Render videos via ImageBridge with current Webrender API (Sotaro)
+ Add WebRenderAsyncImageContainer to get current ImageKey from ImageHost:
+ Add WebRenderCompositor
+ Add Vsync handling
+ Add NotifyImageComposites() handling
+ Linux as the platform - Windows as a stretch goal
Add shared texture support to Webrender (jerry)
- Add shared BufferTexture support to Webrender
- Add shared direct binding Texture support to Webrender
Build infra (Morris)
- Separate bindings into a different crate
- Add --enable-webrender compile option
- Get all tests passing without --enable-webrender (tracking in bug 1316217)
OLDER STUFF OLDER STUFF OLDER STUFF OLDER STUFF OLDER STUFF OLDER STUFF
Quantum Render, the week of October 17th
Summary
- Alternative paths: We are proceeding with the path that involves the most people early, to account for the possibility that shorter paths do not deliver required improvements. While this may lead to "wasted" work, in the sense that we do more than necessary, it covers us better in the scenario where we do need the majority of the work. Since we don't have time to do enough investigations to figure out the minimal amount of work, all the planning is around minimizing the calendar time before we deliver the required improvements, rather than the overall effort.
- Details for the next steps: Get going on Windows. Continue with WR layer manager, add video, texture sharing, IPC and APZ. Groups and leads have been identified. Through Hawaii work week, the goal is to answer technical questions, come up with enough of an architectural description to be able to plan and estimate the remaining work, and identify any potential roadblocks.
- Process changes: working on the name for the approach. It is meant to move the ownership to the team, from individuals, by reducing the time people could spend overlapping, waiting, being blocked or not having enough information to make the right decisions and write the right code. The oversimplified description is "production code only, land something every day, allow for post-landing reviews to reduce blocking." We will remove the "I am waiting for X to land Y", "I have to finish this because people are waiting", "I have to rebase a lot because large change just landed" scenarios, and increase the group ownership of all features.
Original planning and notes
Block out 11am-5pm for meetings and discussions in larger groups.
Monday
- 11am - 1pm Compositor process: design, progress, implications. By the end of the session, all have a general understanding of this work.
- 1pm - 5pm: QuantumRender overview - options, progress, plan. By the end of the day, we have a single (general) path to follow to Hawaii All Hands, and a list of items to discuss post V1.
- Background: everybody has a build
- CAN WE RENAME WEBRENDER?! To what? Splatter
Tuesday
- 11am - 1pm: How we work, and how that may have to change. By the end, we have the ground rules and commitment on following them.
- 1pm - 5pm: Next wave in Quantum Render (texture sharing, video, Windows support), and detailed breakdown of the work for the next three months.
- Background: Decide how we want to communicate and track the details of the project.
Wednesday
- 11am - 1pm: How do we measure performance? Proposal from Google [1], what makes sense for us when comparing Gecko and Quantum.
- 1pm - 3pm: The rest of the roadmap. Do we have the right breakdown? Is there a better one? Do we agree on approximate sizing?
- 3pm: Tooling. More details on the tools that got mentioned in Tuesday AM conversation
Thursday, Friday
- Take the "next wave" breakdown, and start the coding. See if we can find the holes in the arguments, something we may have missed.
- Background: UX has a set of meetings, Vlad will coordinate a meeting (or two) with them.
- Suggested groups:
- WRLayerManager - Jeff, Mason?, Markus?, Dzmitry?, Morris?, etc. - https://public.etherpad-mozilla.org/p/wr-plan
- IPC, Texture sharing, Video - Peter, Jerry?, Nicolas?, Sotaro?, etc.
- APZ: Kats, Botond, etc.
- WebRender completeness
By the time the week is done:
- everybody knows what they're working on through Hawaii;
- we have updated estimates for WRLayerManager, Texture sharing, Video, APZ
[1] https://docs.google.com/document/d/1Owfs6arciEnWgT2-8bWCcHdYRIKRKZ0Xj8UtqRx4c3k/edit#heading=h.qhaqo8u1r0ge
For Firefox:
- Pro
- We're not getting the best out of what we have, could still have room for optimization
- Could get better at invalidation, avoiding drawing commands that use intermediate surfaces / group that make us slow
- Painting / invalidation for many pages are not a bottleneck (we have telemetry data on this)
- Strictly an improvement path
- Very risky to assume WR will make rasterizing future proof. Can become "good enough"
- Con
- Really hard in current system to reason about what would actually happen with an improvement, might improve in one thing, regress in another
- Very strongly coupled system
For Chrome
- Pro
- Relatively well proven to raster / paint pretty fast.
- Architecture isn't dramatically different from what we have
- We already have Skia so fundamentally wouldn't have to change backends
- Responsiveness wins due to many things being off main thread
- Chrome's own paint-time data shows little headroom for further improvement on most pages (random spikes notwithstanding)
- Con
- Could be a class of content that fundamentally cannot be drawn fast with this architecture
- Would not fit in with the "leapfrog competition"
- Inefficient/zero GPU usage
- Questions about ability to scale appropriately with increased dpi/resolution screens
For WebRender
- Pro
- No invalidation, but it could be added. Invalidation could be a caching scheme
- Made for high resolution screens
- Even if GPU takes longer than CPU, it's a different piece of hardware so CPU can do something else
- Shouldn't have any performance cliff, gives content scalability
- fewer isolated pipeline steps
- Flash is the example that this could go fast
- Con
- Not sure how we'd do if we require software rasterization
- Non-GPU output like printing might be easier with the current system
- Constructing the display list every time might be expensive
- Integration complexity, with nothing to show for a while (Jeff thinks the API surface isn't big enough to be a big deal)
- Lots of GPU driver bugs that we'd run into
- Difficulty detecting correctness bugs with drivers
Discussions that we parked: https://public.etherpad-mozilla.org/p/gfx-parking-lot
OLDER STUFF OLDER STUFF OLDER STUFF OLDER STUFF OLDER STUFF OLDER STUFF
WebRender in Gecko the week of August 1st
https://public.etherpad-mozilla.org/p/wr-plan
webrender in gecko - https://public.etherpad-mozilla.org/p/Gecko-WebRenderer
What topics should we cover, perhaps with subsets of people?
- IPDL mess with parent/child and GPU process
- WebRender and Rust pipes instead of IPDL?
- Should we do it on Android first?
- GPU process, compositor process and WebRender - do we need it?
- Compositor process different approach, making APZ easier?
- WebRender, display list items, only six kinds, hierarchy, types, connection, not losing information, ...
- Crash reports etc. with rust - it all just works?
- Non-OGL backends - DirectX? Vulkan?
- Difference between Bas' scene graph and WebRender
- WebRender in rust vs C++
- crash reports
- Low level support libraries in Servo (e.g., bla's)
- CSS animation
- things with less than wonderful names
- security and memory safety
Process separation (for content only, canvas excluded ATM)
- Abstracting input seems like a noble goal
- A perhaps ideal model for the future: (Chrome process, Content process -> Compositor, GPU, WR, APZ, Input)
- A disadvantage of this approach is that if this process crashes you'd lose your window so you'd still need to hold the HWND remotely to survive crashes
- Ship over display items from content/chrome and the Webrender Process does the rest
- Having input events on the same process as APZ is faster since sometimes content needs to block for a round trip ipc message from APZ for hit testing
- No one had a solid case for using D3D11/GL or WebRender GPU command buffers across processes
- seems better to use WR DisplayItems as the cross-process IPC communication
- Quantum Rendering Strawman
- https://docs.google.com/document/d/1w8t-iuwmMzf05nk6m1Pkg_GBFxsqaUTB_WRPjKMZNWw/edit#heading=h.q4hj3ync9q7i
Quantum notes
- glyph caches are shared across pages in WR
Agenda draft, with spaces to fill:
Monday
- bz, Glenn, Jeff, dvander, Bas, Mason, Milan
- Get together
- Mason can walk us through what he did
- Glenn can tell us about some stuff
- Get everybody set up with a Servo+WebRender2 build
- what are the tough topics?
Tuesday
- +Vlad, +Patrick, bz, Glenn, Jeff, dvander, Bas, Mason, Milan
- AM: go through Vlad’s document
- Identify the top nasty issues awaiting us?
Wednesday
- -Patrick? | Vlad, bz, Glenn, Jeff, dvander, Bas, Mason, Milan
- Process some of the stuff from yesterday
- Maybe some prototyping or investigations
- Examine current webrender numbers
Thursday
- +dbaron, Vlad, Patrick, bz, Glenn, Jeff, dvander, Bas, Mason, Milan (everybody!)
- conclude things about Vlad’s document - any things that need push behind investigation
Friday
- -bz, -Patrick, -dbaron | Vlad, Glenn, Jeff, dvander, Bas, Mason, Milan
- next steps, who’s doing what and when
- early afternoon end
---------------------------------------------------------------
What do we measure?
Animations that are bad today, such as slide-out animations. Something web authors can also understand. No cliffs. How do we get telemetry-like data for heuristics in the slow paths?
- Measurements
- Framerate at which composition happens (including frame timings, dropped frames etc.)
- Framerate at which layout actually happens (different because of APZ)
- Input latency (input -> on screen) (timings -- not just average)
- Time of first draw
- Time of arbitrary draw
- Need to be able to compare to past/origin builds (not just yesterday's, but a specific golden build)
- Simulated full browsing session (multiple tabs, open/close, etc. -- both fixed and random order)
- due to caching behaviours of different pages
- Restart of GPU process
- Immediate Measurements
- Servo/WR perf with software (llvmpipe and swiftshader)
- Servo/WR vs Gecko on Windows with modern gfx (d2d/d3d11)
- Pull out pure frame rasterization numbers out of the above
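A rough sketch of why the measurements above ask for timings and not just an average. Everything here is an assumption for illustration: the `FrameStats` and `summarize` names, the nearest-rank percentile, and the idea of counting frames that exceed a fixed per-frame budget. It is not tied to any existing telemetry probe.

```rust
/// Hypothetical summary of one run's frame times (names are illustrative only).
#[derive(Debug, PartialEq)]
pub struct FrameStats {
    pub mean_ms: f64,
    pub p95_ms: f64,
    pub max_ms: f64,
    /// Number of frames that took longer than the per-frame budget.
    pub over_budget: usize,
}

/// Summarize frame durations (in milliseconds) against a per-frame budget.
/// Returns None for an empty sample so callers cannot divide by zero.
pub fn summarize(frame_ms: &[f64], budget_ms: f64) -> Option<FrameStats> {
    if frame_ms.is_empty() {
        return None;
    }

    let mut sorted = frame_ms.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("frame times must not be NaN"));

    let n = sorted.len();
    let mean_ms = sorted.iter().sum::<f64>() / n as f64;

    // Nearest-rank percentile: the smallest sample with at least 95% of
    // all samples at or below it.
    let rank = ((0.95 * n as f64).ceil() as usize).max(1);
    let p95_ms = sorted[rank - 1];
    let max_ms = sorted[n - 1];

    let over_budget = frame_ms.iter().filter(|&&ms| ms > budget_ms).count();

    Some(FrameStats { mean_ms, p95_ms, max_ms, over_budget })
}
```

For the sample `[10.0, 12.0, 11.0, 40.0]` with a 16 ms budget, the mean is 18.25 ms, which looks close to budget, while the 95th percentile and maximum are both 40 ms and one frame is over budget. The same summary computed for a golden build would give the baseline to compare against.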
-----------------------------------------------------------------------
Summary of Tuesday (Aug 2) afternoon discussion:
The pieces that need to happen in general are the following; these can sort of happen in parallel:
- Output WebRender display items (WRDIs) from Gecko display items.
- Implement whatever new WRDI types we need.
- Implement WebRender optimizations of various sorts.
A plausible incremental path is as follows:
1. Implement C struct generation from Rust struct declarations, so we can get C declarations of WRDIs generated based on the actual WebRender code. Hook this up to the build system so we have C structs corresponding to WRDIs in Gecko.
2. Do the WRDI output from Gecko display items; see whether new WRDI types are needed in the process.
3. Add a new way of going from a PaintedLayer to a bitmap by using step 2 to produce WRDIs for the display items in the PaintedLayer and then painting those WRDIs to produce a bitmap.
4. (Optional) Move the execution of step 3 to a non-main thread.
5. (Optional, if not ready to switch to WebRender backend yet) Change the FrameLayerBuilder to output layers corresponding to WRDIs; change compositor to handle many more layers. Somewhere in here do occlusion culling and other optimizations.
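To make the C struct generation idea concrete, here is a minimal sketch. The `WrRect` and `WrSolidColorItem` types and the `c_struct_decl` helper are hypothetical and are not WebRender's actual types. A real generator would parse the Rust source; this toy version takes the field list by hand and only shows the shape of the output.

```rust
/// Hypothetical WRDI building block. `#[repr(C)]` fixes the field order and
/// layout so that a C struct with the same fields is layout-compatible.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct WrRect {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

/// Hypothetical display item made of a rectangle plus an RGBA color.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct WrSolidColorItem {
    pub bounds: WrRect,
    pub rgba: [f32; 4],
}

/// Toy emitter: builds the C declaration for a struct from (C type, field name)
/// pairs. This stands in for the output stage of a generator only.
pub fn c_struct_decl(name: &str, fields: &[(&str, &str)]) -> String {
    let mut out = format!("typedef struct {} {{\n", name);
    for (c_type, field) in fields {
        out.push_str(&format!("    {} {};\n", c_type, field));
    }
    out.push_str(&format!("}} {};\n", name));
    out
}
```

Calling `c_struct_decl("WrRect", &[("float", "x"), ("float", "y"), ("float", "width"), ("float", "height")])` yields a `typedef struct WrRect { ... } WrRect;` declaration that Gecko's C++ side could include, assuming the build system runs the generator before compiling the C++ that uses it.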
Step 2 and step 3 can somewhat happen in parallel as follows: We can have a way to ask a display item to output WRDIs or signal failure. Then a PaintedLayer can try to map all its display items to WRDIs. If that succeeds, paint using the new codepath. If any of them fail, paint using our existing codepath. If we do things this way, then once step 3 is done (which can happen before step 2 is done), we can set up continuous integration which exercises the new codepath. Conceivably we can even do something like run our existing reftests, but use the old codepath for the test and the new one for the reference or vice versa.
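The all-or-nothing fallback described above can be sketched as follows. All names here (`GeckoItem`, `Wrdi`, `PaintPath`, `to_wrdis`, `choose_paint_path`) are hypothetical stand-ins, not Gecko or WebRender types. The sketch assumes a display item either yields its WRDIs or returns `None` to signal failure.

```rust
/// Hypothetical, heavily simplified Gecko display items.
#[derive(Debug, Clone, PartialEq)]
pub enum GeckoItem {
    SolidColor { rect: [f32; 4], rgba: [f32; 4] },
    Border { rect: [f32; 4], width: f32 },
    /// An item kind with no WRDI mapping yet in this sketch.
    Unsupported(&'static str),
}

/// Hypothetical WebRender display items (WRDIs).
#[derive(Debug, Clone, PartialEq)]
pub enum Wrdi {
    Rect { rect: [f32; 4], rgba: [f32; 4] },
    Border { rect: [f32; 4], width: f32 },
}

impl GeckoItem {
    /// Output the WRDIs for this item, or None to signal failure.
    pub fn to_wrdis(&self) -> Option<Vec<Wrdi>> {
        match self {
            GeckoItem::SolidColor { rect, rgba } => {
                Some(vec![Wrdi::Rect { rect: *rect, rgba: *rgba }])
            }
            GeckoItem::Border { rect, width } => {
                Some(vec![Wrdi::Border { rect: *rect, width: *width }])
            }
            GeckoItem::Unsupported(_) => None,
        }
    }
}

/// Which codepath a PaintedLayer should use for its display items.
#[derive(Debug, PartialEq)]
pub enum PaintPath {
    /// Every item mapped: paint these WRDIs with the new codepath.
    WebRender(Vec<Wrdi>),
    /// At least one item failed to map: paint with the existing codepath.
    Existing,
}

/// Try to map every display item; fall back as soon as one fails.
pub fn choose_paint_path(items: &[GeckoItem]) -> PaintPath {
    let mut wrdis = Vec::new();
    for item in items {
        match item.to_wrdis() {
            Some(mut mapped) => wrdis.append(&mut mapped),
            None => return PaintPath::Existing,
        }
    }
    PaintPath::WebRender(wrdis)
}
```

Because the decision is per layer, the new codepath gets exercised on any layer whose items are all supported, even while other item kinds are still unmapped. That is what would let continuous integration cover the new path early.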
=======================================================
Which things are hard:
- SVG
- Canvas performance in Servo
- MathML
- XUL tree
- tooling!
Which things are large:
- 90-odd display items, some more complicated than others
Things to answer:
- Where do we put the low level utilities that are required by WebRender, but not really graphics specific? We have Euclid, ipc.
- Printing
- APZ - from Java to C++ to Rust
Random thoughts:
- concept of a level of detail
==================================================================
Decisions Made
Initial min hw spec
- opengl es 3.0
- DX 10 feature level
Assume XUL, MathML are needed.
Assume WR doesn't have to deal with native widgets at all.
Things that might be good to prioritise in Servo/WR near future:
Large image support for WR (too large for hardware)
Canvas + cross process texture / context sharing for canvas.
Video support
Software fallback
Animated image support
Progressive display of images
SVG
D3D - maybe (but opengl testing priority) (the biggest advantage for this will be interop with DXVA)
- progressive display of images
- Software fallback
- nested clips
- subpixel aa
==================================================================
What can we use to measure performance? Among other things...
https://s3.amazonaws.com/mozilla-games/emunittest-public/index.html
==================================================================
In order, next steps:
0. Before ramping up a number of people on Servo related tasks
- a. the Windows build needs to be stable and supported as a first class platform
- b. releng and integration of complex rust libraries
- c. what can we lift from Stylo (talk to Bobby)
- Single DI (text?) with WR in Gecko - Boris & Jeff
- OpenGL Windows / Android measurements - Mason - ~= ok, mostly working and nothing stupidly surprisingly bad (nvidia worse than intel iris on mac)
- Decide if SVG is fully supported or rendered to texture - Jeff
- Video in Servo - Sotaro August 17 - (make sure Android builds are working)
- Large textures (overflowing cache and/or maximum hardware size) - Glenn
- Software fallback - Glenn
- Canvas - Mason
- Current implementation - Canvas -> skia via CPU, passes over to webrender via byte buffer, then webrender treats it as an image, uploads to GPU and renders
- New implementation - Canvas -> skia GL -> Skia uploads to GPU, send texture id to webrender, webrender just samples the texture into the appropriate place
- Current approach - Canvas -> Skia GL on content process in gecko -> Skia uploads to GPU -> GPU Process + WebRender in Gecko - output shared texture
- Alternate approach - remote canvas commands to gpu process
- rust-layers in servo/rust-layers - does opengl layer management, some rust implementations of cross process texture sharing
- APZ evaluation - what and how big