Tools for Continuous Integration at Google Scale (GTAC2010)
Nathan York
29 min, 55 sec
http://www.youtube.com/watch?v=b52aXZ2yi08

Slide Notes:

  • Software Engineering Gap – lots of work on platforms and compilers, lots of work on apps. Middle (build systems, etc.) is often ignored (reach good enough and then move on)
  • Common Build System Issues – Incorrect, Slow, Cumbersome, Under-maintained
  • Why Build Systems Matter – Engineer Productivity, All about feedback
  • The Challenge At Google – 6000 engineers and one code base, everything built from source, development on mainline, extensive automated testing
  • Rough Developer Workflow (flow chart)
  • Better Build System Needed – optimized and tuned build languages, dependency analysis and scheduling, leverage infrastructure. Must be correct AND fast.
  • Inputs, Outputs and Actions – Content addressable storage (by digest of content), use relative paths, eliminate global state
  • Scaling Source Code Access – FUSE based file system. Most code needed for read only, on-demand syncing and caching, all source in the cloud, content digests as metadata
  • Making Builds Fast – Distributed builds in the cloud – built in arbitrary location
  • Scalable Distributed Builds – Caching key to scalable build. If inputs (from digest) and actions are same as previous, return prior result.
  • Scaling Build Outputs – FUSE based file systems, all output in the cloud, shared across builds and users
  • System View – builds appear local but are in the cloud
  • Platform for Automated Testing – Executing a test is just another build action.
  • Results – 20+ code changes per minute, 65K builds per day, 10000 CPUs, 50 TB memory, ~1PB output every 7 days, 94% cache hit rate.
  • Estimating build tool savings 2008 to 2009: Saving ~600 person years
  • Conclusion: build system is a core component of software engineering
  • Questions