This section covers PostHog's data model, ingestion pipeline, ClickHouse setup and data querying. This page provides an overview of how PostHog is structured.
Broad overview
There are only a few systems to consider:
- A website and API for users
- An API for client apps
- A plugin service for processing events on ingestion
- A worker service for processing events in response to triggers
graph LR
    u[User]
    sdk[Client Apps/SDKs]
    ex[Export Sink]
    dj[Web/API]
    p[plugin/worker service]
    c[Celery]
    ds[(Data stores)]
    u --> dj
    sdk --> dj
    p --> ex
    dj --> p
    p <--> ds
    dj --> c
    c <--> ds
    dj <--> ds
Zooming closer
Adding detail reveals the flow between parts of the system.
graph LR
    u[User]
    sdk[Client Apps/SDKs]
    ex[Export Sink]
    ds[(Data stores)]
    subgraph Web/API
        w[Web]
        c[Capture API]
        d[Decide API]
        cr[cron]
        ce[Celery]
    end
    subgraph "plugin/worker service"
        i[Ingestion]
        a[Async]
        t[timer]
    end
    u -->|views insights and more| w
    sdk -->|send events| c
    sdk -->|read| d
    w --> ds
    c -->|write events| i
    i -->|onEvent| a
    i -->|save events| ds
    t -->|onTimer| a
    a -->|export events| ex
    d -->|e.g. read flags| ds
    ds -->|read events| a
    w -->|start task| ce
    cr -->|on schedule| ce
    ce --> ds
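The event flow in the diagram above can be sketched as a small in-memory pipeline. This is an illustrative model only, not PostHog's code: the `Pipeline` class, its method names, and the list-based `data_store` and `export_sink` are all hypothetical stand-ins for the Capture API, the ingestion step, the async handlers, the data stores, and an export sink.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Toy model of the capture -> ingestion -> async -> export flow."""

    data_store: list = field(default_factory=list)   # stand-in for the data stores
    export_sink: list = field(default_factory=list)  # stand-in for an export sink

    def capture(self, event: dict) -> None:
        # Capture API: accept an event from a client app and pass it to ingestion.
        self.ingest(event)

    def ingest(self, event: dict) -> None:
        # Ingestion: save the event, then trigger the async onEvent handler.
        self.data_store.append(event)
        self.on_event(event)

    def on_event(self, event: dict) -> None:
        # Async handler: export each event as it arrives.
        self.export_sink.append({"trigger": "onEvent", "event": event["event"]})

    def on_timer(self) -> None:
        # Timer handler: read events back from the store and export a summary.
        self.export_sink.append({"trigger": "onTimer", "count": len(self.data_store)})


pipeline = Pipeline()
pipeline.capture({"event": "pageview", "distinct_id": "user-1"})
pipeline.capture({"event": "signup", "distinct_id": "user-1"})
pipeline.on_timer()
```

In this sketch every step runs synchronously in one process. In the diagram the same steps are separate components, so the hand-offs between them would not be direct function calls.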
Zoomed right in
flowchart TD
    subgraph K8s ["K8s PostHog namespace"]
        Ingress(Ingress)
        PG[(Postgres Stateful service)]
        Kafka[(Kafka Stateful Service)]
        KafkaEvents[(Kafka Stateful Service)]
        Redis[(Redis SS)]
        ServiceLB([Service Load Balancer])
        ServiceLBReads([Service Load Balancer])
        subgraph ServicesDB [K8s Services]
            PGBouncer
        end
        subgraph CH ["ClickHouse Cluster (Operator Managed)"]
            CH1[(Replica 1 Shard 1)]
            CH2[(Replica 1 Shard 2)]
            CH3[(Replica 2 Shard 1)]
            CH4[(Replica 2 Shard 2)]
        end
        subgraph ZK [K8s ZooKeeper cluster]
            ZK1
            ZK2
            ZK3
        end
        subgraph AppServices [K8s Services]
            Events(Events Service)
            App(Web Service)
            %% Invisible helper node
            na
        end
        subgraph WorkerServices [K8s Services]
            Plugins[Plugin Service]
            Worker[Worker Service]
        end
    end
    AppServices --> ServiceLB --> ServicesDB --> PG
    Events --> KafkaEvents --> Plugins --> Kafka --Write path--> CH
    WorkerServices --> ServiceLBReads
    WorkerServices --> ServiceLB
    ServiceLBReads --Read path--> CH
    AppServices --> ServiceLBReads
    CH --> ZK
    CH1 <--> CH2
    CH3 <--> CH4
    ClientApps(Client Apps) --- Ingress
    %% KLUDGE: Use invisible nodes for styling purposes
    Ingress --- na[ ]
    na --Other traffic--> App
    na --Events endpoint --> Events
    style na height:0px,width:0px
    Redis -.- WorkerServices
    AppServices -.- Redis
    AppServices -.Optional Utilization telemetry.-> Telemetry(Posthog License Telemetry service)
    class K8s,ServicesDB,CH,ZK,AppServices,WorkerServices sgraph;
Apart from traffic through the ingress controller, which serves the app and collects data, no communication is needed into or out of this namespace.
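One way to check a reading of the deployment diagram is to treat it as a directed graph and trace the paths through it. The sketch below is a simplification transcribed by hand from the diagram, not something generated from a real deployment. It collapses the ClickHouse and ZooKeeper clusters into single nodes, attributes the subgraph-level `AppServices` edges to the web service alone, and treats the undirected ingress links as pointing inward. `EDGES` and `find_path` are hypothetical names.

```python
from collections import deque

# Simplified, hand-transcribed edge list (an assumption, see the text above).
EDGES = {
    "ClientApps": ["Ingress"],
    "Ingress": ["App", "Events"],
    "Events": ["KafkaEvents"],
    "KafkaEvents": ["Plugins"],
    "Plugins": ["Kafka"],
    "Kafka": ["ClickHouse"],           # write path
    "App": ["ServiceLBReads", "ServiceLB"],
    "ServiceLBReads": ["ClickHouse"],  # read path
    "ServiceLB": ["PGBouncer"],
    "PGBouncer": ["Postgres"],
    "ClickHouse": ["ZooKeeper"],
}


def find_path(start, goal):
    """Breadth-first search; returns the shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


write_path = find_path("Events", "ClickHouse")
postgres_path = find_path("ClientApps", "Postgres")
reverse_path = find_path("ClickHouse", "Ingress")
```

Here `write_path` follows the events service through Kafka and the plugin service into ClickHouse, and `reverse_path` is `None` because this edge list contains no route from the data stores back to the ingress.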