Avoid locking in CREATE/DROP/RENAME TABLE; implement RENAME DATABASE #6787

Closed
@alexey-milovidov

Description

@alexey-milovidov
Member

The idea is to store table data in unique directories whose paths do not contain the table name, and to link to them via symlinks. These directories will be refcounted and may outlive DROP TABLE while the table is still in use.

Step 1:

Allow a special UUID clause in the ATTACH TABLE statement that may contain a randomly generated UUID for the table.

Create DatabaseAtomic, which is intended to replace DatabaseOrdinary as the default database engine.

On table creation it will:

  • generate a UUID for the table;
  • store table metadata as an ATTACH query in the usual /metadata/database/table.sql file;
  • the ATTACH query will contain a UUID clause;
  • the ATTACH query will contain a placeholder instead of the table name. Example: ATTACH TABLE table;
  • create a directory for table data at /store/xxx/xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy/;
  • store is a new directory inside the clickhouse path; xxx is the first three characters of the UUID;
  • the path of this directory contains neither the table name nor the database name;
  • create a symlink /data/database/table that resembles the structure of DatabaseOrdinary;
  • set up a refcount in memory that is held by the database object.
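
The creation steps above can be sketched roughly as follows. This is only an illustration in Python, not ClickHouse code: the helper names `store_path` and `create_table` are hypothetical, and the exact spelling of the UUID clause written into the .sql file is an assumption.

```python
import os
import tempfile
import uuid


def store_path(root: str, table_uuid: uuid.UUID) -> str:
    """Return <root>/store/<first 3 chars of UUID>/<UUID> (hypothetical helper)."""
    text = str(table_uuid)
    return os.path.join(root, "store", text[:3], text)


def create_table(root: str, database: str, table: str) -> uuid.UUID:
    """Hypothetical helper that mirrors the creation steps listed above."""
    table_uuid = uuid.uuid4()

    # Data directory: its path contains neither the table nor the database name.
    data_dir = store_path(root, table_uuid)
    os.makedirs(data_dir)

    # Metadata: an ATTACH query with a UUID clause and a placeholder name.
    # The clause syntax used here is an assumption, not the real grammar.
    metadata_dir = os.path.join(root, "metadata", database)
    os.makedirs(metadata_dir, exist_ok=True)
    with open(os.path.join(metadata_dir, table + ".sql"), "w") as f:
        f.write(f"ATTACH TABLE table UUID '{table_uuid}'\n")

    # Symlink data/<database>/<table> -> store directory, kept for introspection.
    link_dir = os.path.join(root, "data", database)
    os.makedirs(link_dir, exist_ok=True)
    os.symlink(data_dir, os.path.join(link_dir, table))
    return table_uuid
```

For example, `create_table(tempfile.mkdtemp(), "db", "hits")` leaves a `data/db/hits` symlink pointing into `store/`, while the store path itself is derived from the UUID alone.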

On table drop it will:

  • remove symlink /data/database/table;
  • remove table metadata (.sql file);
  • decrement the refcount that was held by the database.

Refcounts for tables are stored only in memory.
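
A minimal sketch of such an in-memory refcount, assuming a hypothetical `TableDirectory` class. For brevity it deletes the directory immediately when the last reference goes away, whereas the real deletion may be postponed.

```python
import os
import shutil
import tempfile
import threading


class TableDirectory:
    """In-memory refcount guarding one data directory (illustrative only)."""

    def __init__(self, path: str):
        self.path = path
        self._refs = 0
        self._lock = threading.Lock()

    def acquire(self) -> None:
        with self._lock:
            self._refs += 1

    def release(self) -> None:
        with self._lock:
            self._refs -= 1
            last = self._refs == 0
        if last:
            # Nobody references the table any more, so the data can be removed.
            shutil.rmtree(self.path)
```

The database object would hold one reference from table creation until DROP, and every user of the table would hold another; the data directory then survives DROP until the last user releases it.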

On table rename it will:

  • rename symlink and metadata file;
  • change the table name in its in-memory object; the table name is read and changed under a short lock on a simple mutex.
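
As a rough sketch of the rename path (the `Table` class and `rename_table` function are hypothetical, and error handling is omitted): only the symlink and the metadata file are renamed, the data directory is never touched, and the in-memory name is swapped under a mutex.

```python
import os
import tempfile
import threading


class Table:
    """Table object whose name is guarded by a simple mutex (illustrative)."""

    def __init__(self, name: str):
        self._name = name
        self._name_lock = threading.Lock()

    @property
    def name(self) -> str:
        with self._name_lock:
            return self._name

    def set_name(self, new_name: str) -> None:
        with self._name_lock:
            self._name = new_name


def rename_table(root: str, database: str, table: Table, new_name: str) -> None:
    data = os.path.join(root, "data", database)
    meta = os.path.join(root, "metadata", database)
    old_name = table.name
    # os.rename on a symlink renames the link itself, not its target.
    os.rename(os.path.join(data, old_name), os.path.join(data, new_name))
    os.rename(os.path.join(meta, old_name + ".sql"),
              os.path.join(meta, new_name + ".sql"))
    # The lock is held only for the assignment of the new name.
    table.set_name(new_name)
```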

On database load at startup:

  • the table name is determined by the name of the .sql file.
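
Because the ATTACH query stores only a placeholder, the loader has to recover names from the file names. A minimal sketch, assuming a hypothetical `load_table_names` function and ignoring any escaping of special characters in file names:

```python
import os
import tempfile
from typing import List


def load_table_names(metadata_dir: str) -> List[str]:
    """Derive table names from the *.sql files in a database metadata directory."""
    names = []
    for entry in sorted(os.listdir(metadata_dir)):
        stem, ext = os.path.splitext(entry)
        if ext == ".sql":
            # The file name, not the query text, determines the table name.
            names.append(stem)
    return names
```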

Additional considerations:

  • the table data may be deleted lazily and deletion can be postponed to some time similar to deletion of data parts in StorageMergeTree;
  • the table data may be left as "garbage" after an incorrect server restart; it is probably better to avoid implementing garbage collection at all (or at least to avoid doing it automatically);
  • if there is a safety limit on maximum table size to drop, it should limit both the DROP query itself and background deletion;
  • provide a way to naturally use UUIDs as "replica path" for ReplicatedMergeTree tables; the user should not worry about replica path;
  • need to show UUIDs in system.tables;
  • an easy possible enhancement is to allow specifying a different path for "store" per table;
  • tables for different databases are stored together in "store" - this allows simple moving between databases with RENAME (if the databases have the same engine and tables have the same stores);
  • symlinks are actually unneeded but will be created for easy introspection/debugging.

Step 2:

We want the same for databases to allow RENAME DATABASE. But we cannot create a different engine; we just need to adapt the existing catalog:

  • allow database.sql files to contain UUID clause and a placeholder instead of database name: CREATE DATABASE database ENGINE = 'Atomic' UUID = 'yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy';
  • make metadata directory a symlink to /store/xxx/xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy/metadata;
  • make data directory a symlink to /store/xxx/xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy/data;
    (note that if the database is DatabaseAtomic, its data directory will also contain symlinks)

Databases will be created in this way under a feature flag (initially disabled by default).
The default database engine (Ordinary or Atomic) is also controlled by a setting.

Activity

deleted a comment from stale on Nov 1, 2019
changed the title [-]RFC: Avoid locking in CREATE/DROP/RENAME TABLE; implement RENAME DATABASE[/-] [+]Avoid locking in CREATE/DROP/RENAME TABLE; implement RENAME DATABASE[/+] on Dec 29, 2019
alexey-milovidov commented on Apr 26, 2020

Next steps: enable Atomic as default database engine.

      Participants

      @tavplubix@alexey-milovidov
