In a very common data loss scenario, a table is dropped and an empty one is created with the same name. This happens because mysqldump in many cases generates a “DROP TABLE” statement before the “CREATE TABLE”:
DROP TABLE IF EXISTS `actor`;
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `actor` (
  `actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `first_name` varchar(45) NOT NULL,
  `last_name` varchar(45) NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`actor_id`),
  KEY `idx_actor_last_name` (`last_name`)
) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;
If there were no subsequent CREATE TABLE, the recovery would be trivial: the index_id of the PRIMARY index of table sakila/actor still resides in the InnoDB dictionary, although it is marked as deleted.
Knowing index_id, it’s easy to get the table back:
# ./bin/constraints_parser.SYS_TABLES -4Df pages-1345219305/FIL_PAGE_INDEX/0-1
LOAD DATA INFILE '/root/src/recovery-tool/s_tools/dumps/default/SYS_INDEXES' REPLACE INTO TABLE `SYS_TABLES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_TABLES\t' (`NAME`, `ID`, `N_COLS`, `TYPE`, `MIX_ID`, `MIX_LEN`, `CLUSTER_NAME`, `SPACE`);
SYS_TABLES "sakila/actor" 15 4 1 0 0 "" 0
SYS_TABLES "sakila/actor" 15 4 1 0 0 "" 0
# ./bin/constraints_parser.SYS_INDEXES -4Df pages-1345219305/FIL_PAGE_INDEX/0-3| grep 15 | grep PRIMARY
LOAD DATA INFILE '/root/src/recovery-tool/s_tools/dumps/default/SYS_INDEXES' REPLACE INTO TABLE `SYS_INDEXES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_INDEXES\t' (`TABLE_ID`, `ID`, `NAME`, `N_FIELDS`, `TYPE`, `SPACE`, `PAGE_NO`);
SYS_INDEXES 15 18 "PRIMARY" 1 3 0 4294967295
SYS_INDEXES 15 18 "PRIMARY" 1 3 0 4294967295
SYS_INDEXES 15 18 "PRIMARY" 1 3 0 4294967295
# ln -fs table_defs.h.actor include/table_defs.h
# make constraints_parser
# ./constraints_parser -5Uf pages-1345219305/FIL_PAGE_INDEX/0-18
actor 1 "PENELOPE" "GUINESS" "2006-02-15 04:34:33"
actor 2 "NICK" "WAHLBERG" "2006-02-15 04:34:33"
actor 3 "ED" "CHASE" "2006-02-15 04:34:33"
actor 4 "JENNIFER" "DAVIS" "2006-02-15 04:34:33"
actor 5 "JOHNNY" "LOLLOBRIGIDA" "2006-02-15 04:34:33"
...
actor 199 "JULIA" "FAWCETT" "2006-02-15 04:34:33"
actor 200 "THORA" "TEMPLE" "2006-02-15 04:34:33"
How does the following “CREATE TABLE” make the recovery any harder?
When a table is dropped, InnoDB deletes the respective records from the dictionary (tables SYS_TABLES, SYS_INDEXES and others). Physically the records remain in place; they are just marked as deleted. That’s why it was possible to recover them with the -D option.
When a user immediately creates the same table, InnoDB adds new records to the dictionary. The new records are the same size as the old ones, because the only variable-length field in the SYS_* tables is NAME. Hence, InnoDB puts the new records into the same positions in the pages, and the table_id and index_id of the dropped table get overwritten.
mysql> CREATE TABLE actor (
    -> actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    -> first_name VARCHAR(45) NOT NULL,
    -> last_name VARCHAR(45) NOT NULL,
    -> last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    -> PRIMARY KEY (actor_id),
    -> KEY idx_actor_last_name (last_name)
    -> )ENGINE=InnoDB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.56 sec)

# ./page_parser -f /var/lib/mysql/ibdata1
# ./bin/constraints_parser.SYS_TABLES -4Df pages-1345220873/FIL_PAGE_INDEX/0-1
LOAD DATA INFILE '/root/src/recovery-tool/s_tools/dumps/default/SYS_TABLES' REPLACE INTO TABLE `SYS_TABLES` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'SYS_TABLES\t' (`NAME`, `ID`, `N_COLS`, `TYPE`, `MIX_ID`, `MIX_LEN`, `CLUSTER_NAME`, `SPACE`);
#
If index_id is unknown there are several ways to learn it.
Grepping for a known string
In this particular example we know that PENELOPE GUINESS was in the table. We can concatenate adjacent strings and search for them in the pages:
# grep -r "PENELOPEGUINESS" pages-1345220873/FIL_PAGE_INDEX/
Binary file pages-1345220873/FIL_PAGE_INDEX/0-18/9733-00009792.page matches
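The same search can be sketched in Python, which is handy when the needle contains bytes that are awkward to pass to grep. This is only an illustration: the helper names `find_pages` and `index_id_of` are made up for this example and are not part of the recovery toolkit. It assumes the directory layout produced by page_parser, where each index has its own directory such as 0-18.

```python
import os

def find_pages(root, needle):
    """Return sorted paths of files under `root` whose raw bytes contain `needle`."""
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                if needle in fh.read():
                    matches.append(path)
    return sorted(matches)

def index_id_of(page_path):
    """Name of the directory holding the page, e.g. '0-18' (hypothetical helper)."""
    return os.path.basename(os.path.dirname(page_path))

# Adjacent column values are concatenated into a single needle, as with grep:
# needle = b"PENELOPE" + b"GUINESS"
# for page in find_pages("pages-1345220873/FIL_PAGE_INDEX", needle):
#     print(index_id_of(page), page)
```

Like the grep approach, this only tells you which index directories contain the string; it does not say which database or table they belonged to.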
Earlier, Aurimas explained how to deal with tables that contain only binary values.
This method has drawbacks:
- It requires a lot of manual work and is therefore slow.
- There is no way to differentiate tables based on values only. For example, if you dropped table actor from db1 and db2, you’ll get two different index_id values. Both are valid, but there is no general way to know which index_id belongs to which database/table.
- Sometimes it’s not possible to find the index_id, either because there are too many matches or because there is no suitable string to grep for.
Using information from UNDO segment
SYS_TABLES and SYS_INDEXES are normal InnoDB tables. They are also subject to MVCC.
When a record is deleted from SYS_TABLES it is copied to the undo segment. A pointer to the old version is stored in the internal field DB_ROLL_PTR.
Unfortunately, this pointer is lost when a new record is inserted. But the actual old values remain in the undo segment for some time.
Let’s review some of the fields of an undo slot:
PK len (compressed ulint, 1-5 bytes)
PK data
old values:
  n_fields (compressed ulint, 1-5 bytes)
  field_no (compressed ulint, 1-5 bytes)
  field_len (compressed ulint, 1-5 bytes)
  field data
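For small values a compressed ulint takes a single byte that holds the value itself, which is why short field numbers and lengths can be matched as literal bytes. Below is a minimal Python sketch of the decoding, assuming the encoding used by InnoDB’s mach_read_compressed() in the 5.x source tree. The function name `read_compressed` is made up for this illustration.

```python
def read_compressed(buf, pos=0):
    """Decode a compressed ulint at buf[pos]; return (value, bytes_consumed).

    Assumed encoding: the top bits of the first byte select the length,
    and the remaining bits hold a big-endian value.
    """
    flag = buf[pos]
    if flag < 0x80:                      # 0xxxxxxx: 1 byte
        return flag, 1
    if flag < 0xC0:                      # 10xxxxxx: 2 bytes
        return int.from_bytes(buf[pos:pos + 2], "big") & 0x7FFF, 2
    if flag < 0xE0:                      # 110xxxxx: 3 bytes
        return int.from_bytes(buf[pos:pos + 3], "big") & 0x3FFFFF, 3
    if flag < 0xF0:                      # 1110xxxx: 4 bytes
        return int.from_bytes(buf[pos:pos + 4], "big") & 0x1FFFFFFF, 4
    # 0xF0 marker byte followed by a plain 4-byte value: 5 bytes
    return int.from_bytes(buf[pos + 1:pos + 5], "big"), 5
```

Any value below 0x80 (128) is therefore stored as one byte equal to the value, so a field number of 0 is the byte 0x00 and a length of 12 is the byte 0x0C.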
We know that sakila/actor is field #0 and its length is 12 bytes. So, if we ever encounter the byte sequence
0x00 0x0C sakila/actor 0x01 0x08
the next eight bytes will be the table_id.
The same approach applies to SYS_INDEXES. We can search for the sequence
0x00 0x08 <table_id> 0x01 0x08
and the next 8 bytes are the index_id.
In revision 69 of the Percona Data Recovery Tool for InnoDB, two tools were added: s_tables and s_indexes. They scan for a given pattern and output the next 8 bytes, which are either the table_id or the index_id.
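To make the two patterns concrete, here is a rough Python sketch of such a scan. It is not the actual s_tables/s_indexes code. The function names are invented for this illustration, and it assumes the 8-byte ids are stored big-endian (as InnoDB stores integers) and that the table name is shorter than 128 bytes, so its length fits into one byte.

```python
import struct

def table_pattern(name):
    """0x00 <len> <name> 0x01 0x08: field #0 (NAME), then field #1 of length 8."""
    raw = name.encode("ascii")
    if len(raw) >= 0x80:
        raise ValueError("name too long for a one-byte length")
    return b"\x00" + bytes([len(raw)]) + raw + b"\x01\x08"

def index_pattern(table_id):
    """0x00 0x08 <table_id> 0x01 0x08: field #0 (TABLE_ID), then field #1 of length 8."""
    return b"\x00\x08" + struct.pack(">Q", table_id) + b"\x01\x08"

def scan(data, pattern):
    """Return the sorted, de-duplicated 8-byte ids that follow `pattern` in `data`."""
    found = set()
    start = 0
    while True:
        hit = data.find(pattern, start)
        if hit < 0:
            break
        end = hit + len(pattern)
        if end + 8 <= len(data):
            found.add(struct.unpack(">Q", data[end:end + 8])[0])
        start = hit + 1
    return sorted(found)
```

With the contents of ibdata1 read into `data`, `scan(data, table_pattern("sakila/actor"))` would list candidate table_id values, and `scan(data, index_pattern(table_id))` would list candidate index_id values for each of them. Reading the whole file into memory is a simplification; a large system tablespace would need to be scanned in chunks.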
Let’s review an example:
# ./s_tables /var/lib/mysql/ibdata1 sakila/actor
sakila/actor 15
# ./s_indexes /var/lib/mysql/ibdata1 15
0-18
0-19
#
Apparently, 0-18 is the PRIMARY index and 0-19 is the idx_actor_last_name index.
It is not guaranteed, though, that an undo slot has been flushed to disk. InnoDB pages are modified in memory first, and only after some time does InnoDB write them permanently to disk. But the changes a transaction makes are written to the redo log right before a successful response to COMMIT (or within about a second if innodb_flush_log_at_trx_commit != 1).
In that case we can scan the redo log:
# ./s_tables /var/lib/mysql/ib_logfile1 sakila/actor
sakila/actor 15
# ./s_indexes /var/lib/mysql/ib_logfile1 15
0-18
0-19
Now recovery of sakila/actor becomes trivial again, as it was before the “CREATE TABLE”.
The post Recovery after DROP & CREATE appeared first on MySQL Performance Blog.