Performance measurements… and the people who love them
2025-05-20
Developers have a gut-felt understanding of performance, but that intuition breaks down when systems reach Cloudflare’s scale....
2024-10-31
Post-acquisition, we migrated Baselime from AWS to the Cloudflare Developer Platform and in the process, we improved query times, simplified data ingestion, and now handle far more events, all while cutting costs. Here’s how we built a modern, high-performing observability platform on Cloudflare’s network. ...
2024-06-03
Recently, Cloudflare's Observability team undertook an effort to migrate our existing syslog-ng-backed logging infrastructure to one backed by OpenTelemetry Collectors. In this post, we detail the process that we undertook, and the difficulties we faced along the way...
2024-05-14
Go 1.20 introduced support for Profile-Guided Optimization (PGO) to the Go compiler. This post covers the process we created for experimenting with PGO at Cloudflare, and measuring the CPU savings...
2024-04-05
Today, we’re thrilled to announce that Cloudflare has acquired Baselime, a serverless observability company...
April 04, 2024 1:05 PM
Today we are announcing five updates that put more power in your hands – Gradual Deployments, Source mapped stack traces in Tail Workers, a new Rate Limiting API, brand-new API SDKs, and updates to Durable Objects – each built with mission-critical production services in mind...
March 29, 2024 1:00 PM
Learn how Cloudflare used open-source tools to enhance alert observability, leading to increased resilience and improved on-call team well-being...
January 24, 2024 2:00 PM
Foundations is a foundational Rust library, designed to help scale programs for distributed, production-grade systems...
January 08, 2024 2:00 PM
In this post, we’re going to go over what that looks like, how we achieve high availability, and how we meet our Service Level Objectives (SLOs) while shipping close to a million log lines per second...
September 28, 2023 1:00 PM
Earlier this year, we introduced integrations with Supabase, PlanetScale, Neon and Upstash. Today, we are thrilled to introduce our newest additions to Cloudflare’s Integrations Marketplace – Sentry, Turso and Momento...
March 03, 2023 2:00 PM
Here at Cloudflare we run over 900 instances of Prometheus with a total of around 4.9 billion time series. Operating such a large Prometheus deployment doesn’t come without challenges. In this blog post we’ll cover some of the issues we hit and how we solved them...
January 24, 2023 2:00 PM
At Cloudflare, we take steps to ensure we are resilient against failure at all levels of our infrastructure. This includes Kafka, which we use for critical workflows such as sending time-sensitive emails and alerts....
September 28, 2022 1:00 PM
Cloudflare is excited to announce that we are releasing a free version of Magic Network Monitoring (previously called Flow Based Monitoring). Magic Network Monitoring receives network flow data from a customer’s router(s) and provides network traffic analytics via Cloudflare’s...
May 19, 2022 3:39 PM
Pint is a tool we developed to validate our Prometheus alerting rules and ensure they are always working...
April 13, 2021 1:00 PM
Cloudflare adds Datadog, Honeycomb, New Relic, Sentry, Splunk, and Sumo Logic as observability partners to the Cloudflare Workers Ecosystem...
January 14, 2021 12:00 PM
In this article, we will discuss one of the techniques we use to fight such software complexity: simulations. Simulations are basically system tests that run with synthesized customer traffic and applications....