Schedule for PostgresLondon 2019 - The evolution of PGConf UK
PostgreSQL 11 is not done yet. In fact, most people have not upgraded to 10 yet! But at this point, we are well past feature freeze, and should know pretty well what's going to be included. This talk will outline some of the bigger and more interesting features to look forward to!
PostgreSQL is a database that relies heavily on functionality provided by the OS. This approach allows it to reuse best practices and algorithms for utilizing machine resources such as memory and CPU time. On the other hand, it makes PostgreSQL dependent on the OS: how you configure your OS can significantly affect database performance.
In this talk we’ll cover common techniques for configuring the Linux kernel to work efficiently with PostgreSQL. We’re going to discuss PostgreSQL and kernel internals, some important questions about how they work together, and how different options and features of the Linux kernel can help you manage high load with PostgreSQL.
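As a taste of the kind of settings involved, here is a sketch of a few kernel parameters often tuned for PostgreSQL hosts. The values shown are illustrative assumptions only, not recommendations for any particular workload:

```
# /etc/sysctl.conf -- example values only; tune for your hardware and workload
vm.overcommit_memory = 2       # strict accounting, so the OOM killer is less likely to hit the postmaster
vm.swappiness = 1              # prefer reclaiming page cache over swapping backend memory
vm.dirty_background_ratio = 3  # start background writeback of dirty pages earlier
vm.dirty_ratio = 10            # cap dirty pages before writers are forced to flush
```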
Wei Shan Ang
In PostgreSQL, VACUUM reclaims the storage occupied by dead tuples. It is needed to support MVCC and to maintain database performance. Under a high-volume read-write workload, the default auto-vacuum configuration is not aggressive enough to keep up. In this talk, I will share how to tune auto-vacuum and how its parameters affect its performance.
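Auto-vacuum can be tuned cluster-wide in postgresql.conf or per table via storage parameters. The snippet below is a minimal sketch of the per-table approach; the table name `orders` and the values are hypothetical:

```sql
-- postgresql.conf (cluster-wide), illustrative values only:
--   autovacuum_vacuum_scale_factor = 0.05
--   autovacuum_vacuum_cost_limit   = 1000

-- Per-table override for a hot, frequently-updated table:
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum after ~1% of rows are dead
    autovacuum_vacuum_cost_delay   = 2      -- throttle less than the default
);
```

Per-table overrides let you make auto-vacuum aggressive only where it matters, without paying the I/O cost on every table.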
Have you ever thought that you needed to be a programmer to do stream processing and build streaming data pipelines? Think again!
Companies new and old are all recognising the importance of a low-latency, scalable, fault-tolerant data backbone, in the form of the Apache Kafka® streaming platform. With Kafka, developers can integrate multiple sources and systems, which enables low-latency analytics, event-driven architectures and the population of multiple downstream systems. These data pipelines can be built using configuration alone.
In this talk, we’ll see how easy it is to stream data from a database such as PostgreSQL into Kafka using CDC and Kafka Connect. In addition, we’ll use KSQL to filter, aggregate and join it to other data, and then stream this from Kafka out into multiple targets such as Elasticsearch and S3. All of this can be accomplished without a single line of code!
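To illustrate the "no code" claim, a KSQL transformation is itself just SQL. The sketch below assumes a hypothetical CDC topic `pg.public.customers` already populated by Kafka Connect, with its schema registered in Avro:

```sql
-- KSQL: declare a stream over the CDC topic (columns inferred from the Avro schema)
CREATE STREAM customers_src WITH (
    KAFKA_TOPIC  = 'pg.public.customers',
    VALUE_FORMAT = 'AVRO'
);

-- Continuously filter it into a new stream, which other sinks can consume
CREATE STREAM customers_uk AS
    SELECT id, name, email
    FROM customers_src
    WHERE country = 'UK';
```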
Why should Java geeks have all the fun?
My talk describes how a multi-level hierarchy of database group roles and row-level security policies are used to restrict SQL access to scientific research data.
The Global Ecosystems Monitoring (GEM) network is an international effort to measure and understand forest ecosystems and how these respond to climate change. The GEM database serves as a repository for data gathered through fieldwork, predominantly in tropical forests.
A web-based user interface enables an administrator to assign users to research projects, geographic regions, field sites, and forest plots. The application maintains a multi-level hierarchy of database group roles to represent these objects. To restrict access, row-level security policies use the pg_has_role function to determine whether the current database user inherits membership of the forest plot specified by “plot_code” in each record. This enables a researcher to access all data from South America, or only data from Peru, or only data from specific field sites, for example.
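The pattern described above can be sketched as a row-level security policy. The table name `measurements` and policy name are hypothetical; the key idea is that each row's `plot_code` column names a group role, and `pg_has_role` checks whether the current user inherits it:

```sql
-- Enable RLS on the data table, then gate SELECT on role membership
ALTER TABLE measurements ENABLE ROW LEVEL SECURITY;

CREATE POLICY plot_access ON measurements
    FOR SELECT
    USING (pg_has_role(current_user, plot_code, 'member'));
```

Because group roles can themselves be members of broader roles (region, country, project), a single membership grant high in the hierarchy transparently opens access to every plot beneath it.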
The SQL standard has more than 4300 pages and hundreds of optional features. The number of features offered by different products varies vastly. PostgreSQL implements a relatively large number of them.
In this session I present some standard SQL features that work in PostgreSQL, but not in other popular open-source databases. But when it comes to standard conformance, PostgreSQL doesn’t even need to fear the comparison to its commercial competitors: PostgreSQL also supports a few useful standard SQL features that don’t work in any of the three most popular commercial SQL databases.
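One concrete example of such a feature is the standard `FILTER` clause on aggregates, which PostgreSQL supports but several popular databases do not. An illustrative query (table and column names assumed):

```sql
-- Standard SQL FILTER clause: per-condition aggregates in one pass
SELECT count(*)                                   AS total_orders,
       count(*) FILTER (WHERE status = 'open')    AS open_orders,
       count(*) FILTER (WHERE status = 'shipped') AS shipped_orders
FROM orders;
```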
Postgres has always had strong support for relational storage. However, there are many cases where relational storage is either inefficient or overly restrictive. This talk shows the many ways that Postgres has expanded to support non-relational storage, specifically the ability to store and index multiple values, even unrelated ones, in a single database field. Such storage allows for greater efficiency and access simplicity, and can also avoid the negatives of entity-attribute-value (eav) storage. The talk will cover many examples of multiple-value-per-field storage, including arrays, range types, geometry, full text search, xml, json, and records.
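A minimal sketch of several of those multi-value field types in one table (names are illustrative):

```sql
CREATE TABLE docs (
    id       serial PRIMARY KEY,
    tags     text[],     -- array: multiple values in one field
    validity daterange,  -- range type: two bounds in one field
    payload  jsonb       -- arbitrary nested document
);

-- Such fields can be indexed and queried directly:
CREATE INDEX ON docs USING gin (tags);

SELECT * FROM docs WHERE tags    @> ARRAY['postgres'];     -- array containment
SELECT * FROM docs WHERE payload @> '{"kind": "report"}';  -- jsonb containment
```

A GIN index makes containment queries on arrays and jsonb efficient, which is what lets this style of storage compete with a normalized EAV layout.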
The world is becoming increasingly complex for professionals working with data. We've been using ETL and other "offline" processes to integrate data into different platforms and DBMSs. Those solutions are still great, but there are other cases where we need real-time solutions.
This presentation proposes alternative solutions such as message queues and middleware that can replicate data in a heterogeneous setup, letting different systems like MySQL, MongoDB, PostgreSQL and others talk to each other. We will see different tools and techniques such as PostgreSQL Logical Decoding, Apache Kafka, and Tungsten Replicator, and in the end we will have a heterogeneous setup with PostgreSQL and another DBMS working together.
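Logical decoding, the PostgreSQL building block underlying most of these replication pipelines, can be tried directly from SQL with the built-in `test_decoding` plugin. A minimal sketch (slot name is arbitrary):

```sql
-- Create a logical replication slot using the test_decoding output plugin
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- ... perform some INSERT/UPDATE/DELETE activity, then read the change stream:
SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

-- Drop the slot when done, or it will retain WAL indefinitely
SELECT pg_drop_replication_slot('demo_slot');
```

Production tools such as Debezium use the same mechanism with a different output plugin, publishing each change as a message to Kafka.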
Chris Ellis & Ian Hiddleston
Migrating a large energy-insights PostgreSQL database to the cloud was an interesting challenge. This talk will cover many of the complexities we faced and how we tackled them. Starting with whether you should move to the cloud at all, and stepping through what it takes to get you to flip that switch, we'll take a look at:
* Is the cloud right for you
* Managed services vs host it yourself
* How much does it cost to run PostgreSQL in the cloud
* How to run PostgreSQL in the cloud, or maybe how not to
* How to deal with failover and replication in the cloud
* How do I get my data there
* How do I keep my data synchronized
* What to do about backups
* Some of the many pitfalls, it's not easy you know
* Finally flipping that switch
Patroni is a high availability solution to manage hundreds of databases in the cloud, as well as in traditional data centers. It implements automatic failover and works together with Etcd, Zookeeper, Consul or the Kubernetes API to store and retrieve PostgreSQL cluster information in a consistent way, ensuring that there is only one leader at a time. Unlike the majority of existing solutions for automatic failover, it requires minimal effort to configure the HA cluster and supports autodiscovery of new nodes.
In this talk, I will describe how Patroni works, present a live demo of creating a new high-availability cluster, and share experiences of running a few hundred PostgreSQL HA clusters in Zalando’s Database-as-a-Service infrastructure on EC2 instances and on Kubernetes.
The PostgreSQL tarball comes with a bunch of extensions (a.k.a. contrib modules) -- but are they the only ones?
Of course not.
There are lots of other extensions around that are very useful in many areas. This talk will cover those extensions and their usage patterns, with some examples.
PostgreSQL now contains many features to assist with running very large databases. Recent enhancements such as Partitioning and Parallel Query are the most visible, yet there are many others.
VLDB features are also useful in the context of Internet of Things (IoT) applications, so this talk also reviews technical features that support and enhance the IoT use case.
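Declarative partitioning, one of the VLDB features mentioned above, maps naturally onto IoT time-series data. A minimal sketch (table and column names are illustrative), using the PostgreSQL 10+ syntax:

```sql
-- Range-partition sensor readings by timestamp
CREATE TABLE readings (
    device_id int,
    ts        timestamptz NOT NULL,
    value     double precision
) PARTITION BY RANGE (ts);

-- One partition per month; old months can later be detached or dropped cheaply
CREATE TABLE readings_2019_06 PARTITION OF readings
    FOR VALUES FROM ('2019-06-01') TO ('2019-07-01');
```

Queries constrained on `ts` touch only the relevant partitions, and retiring old data becomes a metadata operation instead of a bulk DELETE.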