Pg Stat All TablesEdit
Pg_stat_all_tables is a core part of PostgreSQL’s built-in statistics framework. This system view, often accessed as pg_stat_all_tables, aggregates per-table activity and makes it possible for database administrators and developers to understand how each table in the cluster is being used. It is a practical tool for troubleshooting performance issues, validating maintenance work, and guiding schema and query optimization.
Accessible through the statistics collector, the view focuses on table-level activity across the entire database cluster. It complements more selective views such as pg_stat_user_tables (which limits the scope to user-defined tables) and related statistics views that expose different facets of data access patterns. By exposing counts of scans, inserts, updates, and deletes, along with live and dead row estimates and maintenance activity, pg_stat_all_tables provides a broad picture of how the database workload interacts with its storage objects.
Overview
- What it is: A read-only view that surfaces runtime statistics for every table in the current database cluster.
- Why it matters: It helps identify hot spots, heavy-write tables, and opportunities to adjust indexing, vacuuming, and query plans.
- How it differs from related views: It covers all tables in the cluster, whereas pg_stat_user_tables focuses on user-defined tables. Other views in the family expose related statistics from different angles, such as index activity or block I/O.
The data in pg_stat_all_tables is maintained by PostgreSQL’s statistics collector and is updated in real time as operations occur. The counters are cumulative since the last reset and are stored in shared memory. They are typically reset when the server restarts, or explicitly via functions such as pg_stat_reset() (and, for shared counters, pg_stat_reset_shared()). This means the view is excellent for observing trends over a window of time, but it does not provide long-term archival unless you export or snapshot the data externally.
Columns and data types
A typical row in pg_stat_all_tables includes:
- schemaname (text) and relname (text): identify the namespace and the table.
- seq_scan (bigint): number of sequential scans initiated on the table.
- seq_tup_read (bigint): number of rows read via sequential scans.
- idx_scan (bigint): number of index scans initiated on the table.
- idx_tup_fetch (bigint): number of index entries fetched to satisfy index scans.
- n_tup_ins (bigint): number of rows inserted into the table.
- n_tup_upd (bigint): number of rows updated.
- n_tup_del (bigint): number of rows deleted.
- n_tup_hot_upd (bigint): number of HOT updates (updates that do not require new row versions to be stored in indexes).
- n_live_tup (bigint): estimated number of live rows currently in the table.
- n_dead_tup (bigint): estimated number of dead rows that are not yet removed.
- vacuum_count (bigint): how many times the table has been vacuumed.
- autovacuum_count (bigint): how many times autovacuum has touched the table.
- analyze_count (bigint): how many times ANALYZE has run on the table.
- last_vacuum, last_autovacuum, last_analyze, last_autoanalyze (timestamp with time zone): timestamps for the most recent maintenance activity.
The exact set of columns can vary slightly by PostgreSQL version, but the general pattern remains consistent. For a deeper look at block I/O statistics, you’d also consult the companion views in the pg_statio_all_tables family, which expose things like heap_blks_read and heap_blks_hit.
How it is collected
- Data source: The statistics collector process accumulates counters based on table activity observed during query execution, scans, and maintenance operations.
- Scope: The view aggregates information across all tables visible to the current database connection, spanning all schemas unless filtered by your query.
- Persistence: Counters are in-memory and reset on restart (and can be reset on demand with pg_stat_reset or pg_stat_reset_shared). They do not persist across server restarts unless you export them externally.
Example workflows often involve joining pg_stat_all_tables with catalog views such as pg_class (which holds table relations) and pg_namespace (which holds schemas) to present a richer context for drift in usage patterns across schemas and object types.
Practical uses
- Identify hot tables: Sort by seq_scan or n_tup_upd to find tables that drive most of the workload, then consider targeted indexing or query refactoring.
- Assess maintenance impact: Compare vacuum_count and autovacuum_count with n_dead_tup to determine if dead tuples are being removed efficiently.
- Guide index and query design: Tables with high idx_scan but low idx_tup_fetch may indicate selective queries that benefit from different indexing strategies or query rewrites.
- Schedule maintenance: Use last_vacuum and last_analyze timestamps to fine-tune autovacuum settings or to plan manual VACUUM and ANALYZE runs during low-traffic windows.
- Track growth and churn: n_live_tup versus n_dead_tup can reveal tables where growth strategies or partitioning might be warranted.
Common practice is to query pg_stat_all_tables alongside the relation catalog to fetch human-friendly context, for example:
To find the most scanned user tables (excluding system catalogs):
- SELECT schemaname, relname, seq_scan, idx_scan FROM pg_stat_all_tables WHERE schemaname NOT IN ('pg_catalog', 'information_schema') ORDER BY seq_scan DESC LIMIT 10;
To compare activity against a specific table:
- SELECT * FROM pg_stat_all_tables WHERE relname = 'orders' AND schemaname = 'public';
These queries are often paired with pg_stat_user_tables data to focus on user-defined objects, and with pg_statio_all_tables for a view on block I/O characteristics.
Considerations and caveats
- Temporal context: Since the statistics are cumulative since the last reset, interpretations must consider the time window you’re examining. Short windows may exaggerate or hide trends.
- Sampling and accuracy: The statistics are estimates that can be affected by concurrent transactions and autovacuum activity; they are not perfect measurements but are generally reliable for trend detection.
- Security and visibility: Access to these statistics is governed by PostgreSQL’s normal privileges. In many environments, a role with the appropriate read privileges (often via a dedicated role like pg_read_all_stats) is used to limit exposure of catalog details while still enabling performance tuning.
- Complementary data: For a fuller picture of performance, combine pg_stat_all_tables with related views such as pg_stat_database (database-wide stats) and pg_stat_all_indexes (index-level statistics).