Migrating the Oracle HR Sample Schema to PostgreSQL

I have just finished working through a PostgreSQL book, Learning PostgreSQL (second edition), so it is time for some hands-on practice.

Oracle provides three sample schemas:

Human Resources (HR)
Sales History (SH)
Customer Orders (CO)

HR is the smallest and simplest of the three, so we start with it.

We use the Oracle 19c sample schemas, which can be downloaded with:

git clone --depth 1 --branch v19c https://github.com/oracle-samples/db-sample-schemas.git

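The installation script for the HR schema is human_resources/hr_main.sql. It calls the following scripts in order (the .sql suffix is omitted):

hr_cre: creates tables, sequences and constraints
hr_popul: populates the tables with data
hr_idx: creates indexes
hr_code: creates procedural objects
hr_comnt: adds comments to tables and columns
hr_analz: gathers statistics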

We migrate in the same order and keep the file names identical to Oracle's.
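
Since the order matters, a small driver script can make it explicit. The following is only a sketch: the function name run_hr_scripts and the DRY_RUN switch are hypothetical, and it assumes the role hr and the database sampledb that are created later in this post already exist. By default it only prints the psql commands it would run.

```shell
# Hypothetical driver script; the post itself does not prescribe one.
# Script names follow the order taken from hr_main.sql.
HR_SCRIPTS="hr_cre hr_popul hr_idx hr_code hr_comnt hr_analz"

run_hr_scripts() {
    for name in $HR_SCRIPTS; do
        # Build the psql command line; ON_ERROR_STOP aborts a file on its first error.
        set -- psql -v ON_ERROR_STOP=1 -U hr -d sampledb -f "${name}.sql"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "$*"            # dry run (default): only print the command
        else
            "$@" || return 1     # real run: stop at the first failing script
        fi
    done
}

plan=$(run_hr_scripts)
printf '%s\n' "$plan"
```

Setting DRY_RUN=0 before calling run_hr_scripts executes the scripts for real. psql will then prompt for the password unless PGPASSWORD or ~/.pgpass is configured.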

Creating the database: hr_db

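The Oracle sample schema installation assumes that the database (usually a PDB) already exists, so we likewise need to create the database and the cluster-level objects (mainly roles) first.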

CREATE ROLE dbadmin
    LOGIN
    CREATEDB
    CREATEROLE
    PASSWORD 'Welcome1';

CREATE ROLE hr LOGIN PASSWORD 'Welcome1';
CREATE ROLE sh LOGIN PASSWORD 'Welcome1';

GRANT hr TO dbadmin;
GRANT sh TO dbadmin;

\c postgres dbadmin

CREATE DATABASE sampledb
    OWNER dbadmin
    ENCODING 'UTF8'
    LC_COLLATE='en_US.UTF-8'
    LC_CTYPE='en_US.UTF-8'
    TEMPLATE template0;
	
\c sampledb dbadmin

CREATE SCHEMA AUTHORIZATION hr;
CREATE SCHEMA AUTHORIZATION sh;
GRANT CREATE ON SCHEMA hr TO hr;
GRANT CREATE ON SCHEMA sh TO sh;

Here the database is sampledb, the DBA is dbadmin, and the schema user is hr.

Creating tables: hr_cre

The main change is replacing the NUMBER and VARCHAR2 data types, using the following vi commands:

:1,$ s/NUMBER/NUMERIC/g
:1,$ s/VARCHAR2/VARCHAR/g
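
The same two substitutions can also be applied without opening an editor. This is a minimal sketch using sed. The function name convert_types and the sample lines are made up for illustration, and the sample is only assumed to resemble hr_cre.sql. Like the vi commands, the match is case-sensitive, so a lowercase column name such as phone_number is left alone.

```shell
# Same two substitutions as the vi commands, applied as a stream filter.
convert_types() {
    sed -e 's/VARCHAR2/VARCHAR/g' -e 's/NUMBER/NUMERIC/g'
}

# Illustrative input lines (assumed to resemble hr_cre.sql, not copied from it).
sample='    , phone_number   VARCHAR2(20)
    , salary         NUMBER(8,2)'

converted=$(printf '%s\n' "$sample" | convert_types)
printf '%s\n' "$converted"
```

To convert a whole file, the function can be used as a filter, for example convert_types < oracle/hr_cre.sql > hr_cre.sql (both paths are placeholders).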

The sequences and constraints needed minor syntax adjustments, which are not covered here.
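
The post does not list those adjustments, so the following is only a guess at one of them, not the author's actual change. It assumes the Oracle sequence definitions use the NOCACHE and NOCYCLE options: PostgreSQL's CREATE SEQUENCE has no NOCACHE keyword (CACHE 1 is the default) and spells the other option NO CYCLE. The function name and the sample text are illustrative.

```shell
# Assumed adjustment, not taken from the post: drop NOCACHE lines
# and rewrite NOCYCLE as NO CYCLE.
convert_sequence_options() {
    sed -e '/^[[:space:]]*NOCACHE[[:space:]]*$/d' -e 's/NOCYCLE/NO CYCLE/'
}

# Illustrative Oracle-style sequence definition.
sample='CREATE SEQUENCE locations_seq
 START WITH     3300
 INCREMENT BY   100
 MAXVALUE       9900
 NOCACHE
 NOCYCLE;'

pg_sequence=$(printf '%s\n' "$sample" | convert_sequence_options)
printf '%s\n' "$pg_sequence"
```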

Populating the tables: hr_popul

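The DML statements are all standard SQL and needed no changes.

Only Oracle's REM and Prompt statements (SQL*Plus commands) were replaced, using the following vi commands: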

:1,$ s/^REM/--/g
:1,$ s/^Prompt/\\echo/g
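
For reference, here is a sed sketch of the same replacement. The function name convert_sqlplus_directives and the sample lines are hypothetical. As in the vi command, the doubled backslash in the replacement produces a single literal backslash, so Prompt becomes psql's \echo meta-command.

```shell
# Turn SQL*Plus REM comments into SQL comments and Prompt into psql's \echo.
convert_sqlplus_directives() {
    sed -e 's/^REM/--/' -e 's/^Prompt/\\echo/'
}

# Illustrative input lines, not copied from hr_popul.sql.
sample='REM insert data into the REGIONS table
Prompt ******  Populating REGIONS table ....'

converted=$(printf '%s\n' "$sample" | convert_sqlplus_directives)
printf '%s\n' "$converted"
```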

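Creating indexes: hr_idx

Nothing was changed.

Creating procedural objects: hr_code

This was the most time-consuming part. Fortunately, the Oracle procedures, functions and triggers are fairly simple.

Adding comments: hr_comnt

Only a minor change was needed, using the following vi command: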

:1,$ s/'$/';/g

The reason is that Oracle's syntax is more lenient: some statements lack a terminating ; and PG does not accept that. For example:

COMMENT ON TABLE regions
IS 'Regions table that contains region numbers and names. Contains 4 rows; references with the Countries table.'

COMMENT ON COLUMN regions.region_id
IS 'Primary key of regions table.'

COMMENT ON COLUMN regions.region_name
IS 'Names of regions. Locations are in the countries of these regions.'
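
The fix can be sketched with sed as well. The function name add_missing_semicolons is hypothetical. The pattern only matches a line whose very last character is a single quote, so lines that already end in ; are left alone and running it twice is harmless. It would miss a line with trailing whitespace after the quote.

```shell
# Append ; to every line that ends with a closing single quote.
add_missing_semicolons() {
    sed "s/'\$/';/"
}

sample="COMMENT ON COLUMN regions.region_id
IS 'Primary key of regions table.'"

fixed=$(printf '%s\n' "$sample" | add_missing_semicolons)
printf '%s\n' "$fixed"
```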

Gathering statistics: hr_analz

PG has no statement that gathers statistics for an entire schema, so the tables are analyzed one by one:

DO $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN
        SELECT tablename
        FROM pg_tables
        WHERE schemaname = 'hr'
    LOOP
        EXECUTE 'ANALYZE hr.' || quote_ident(r.tablename);
    END LOOP;
END$$;
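
As an alternative to the DO block, a static script with one ANALYZE per table can be generated ahead of time. This is just a sketch: the function name gen_hr_analz is hypothetical, and the table list is hard-coded from the \d output shown in the results section of this post rather than read from the catalog.

```shell
# Table names as listed by \d in the results section of this post.
HR_TABLES="countries departments employees job_history jobs locations regions"

# Emit one ANALYZE statement per table.
gen_hr_analz() {
    for t in $HR_TABLES; do
        printf 'ANALYZE hr.%s;\n' "$t"
    done
}

analz_sql=$(gen_hr_analz)
printf '%s\n' "$analz_sql"     # redirect to a file such as hr_analz.sql to keep it
```

The DO block adapts automatically when tables are added, while the generated file is easier to read and review. Which one fits better depends on how often the schema changes.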

Some notes

The Oracle scripts commit explicitly, whereas psql runs in autocommit mode by default.

postgres=# \echo :AUTOCOMMIT
on
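
With autocommit on, a standalone COMMIT in a script is harmless: psql only prints the warning "there is no transaction in progress". If the warnings are distracting, the COMMIT lines can be stripped. This is an optional cleanup that the post itself does not mention, and the function name strip_commits and the sample statements are hypothetical.

```shell
# Delete lines that consist solely of COMMIT; (any letter case).
strip_commits() {
    sed '/^[[:space:]]*[Cc][Oo][Mm][Mm][Ii][Tt][[:space:]]*;[[:space:]]*$/d'
}

# Illustrative input, not copied from hr_popul.sql.
sample="INSERT INTO regions VALUES (1, 'Europe');
COMMIT;"

cleaned=$(printf '%s\n' "$sample" | strip_commits)
printf '%s\n' "$cleaned"
```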

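Because the tables have complex referential integrity relationships, these constraints are created after the data is loaded, to avoid errors during the inserts.

Results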

$ psql -U hr sampledb
Password for user hr:
psql (16.9)
Type "help" for help.

sampledb=> \d
              List of relations
 Schema |       Name       |   Type   | Owner
--------+------------------+----------+-------
 hr     | countries        | table    | hr
 hr     | departments      | table    | hr
 hr     | departments_seq  | sequence | hr
 hr     | emp_details_view | view     | hr
 hr     | employees        | table    | hr
 hr     | employees_seq    | sequence | hr
 hr     | job_history      | table    | hr
 hr     | jobs             | table    | hr
 hr     | locations        | table    | hr
 hr     | locations_seq    | sequence | hr
 hr     | regions          | table    | hr
(11 rows)

sampledb=> select count(*) from employees;
 count
-------
   107
(1 row)


sampledb=> \dv
            List of relations
 Schema |       Name       | Type | Owner
--------+------------------+------+-------
 hr     | emp_details_view | view | hr
(1 row)

sampledb=> \di
                       List of relations
 Schema |          Name           | Type  | Owner |    Table
--------+-------------------------+-------+-------+-------------
 hr     | country_c_id_pk         | index | hr    | countries
 hr     | dept_id_pk              | index | hr    | departments
 hr     | dept_location_ix        | index | hr    | departments
 hr     | emp_department_ix       | index | hr    | employees
 hr     | emp_email_uk            | index | hr    | employees
 hr     | emp_emp_id_pk           | index | hr    | employees
 hr     | emp_job_ix              | index | hr    | employees
 hr     | emp_manager_ix          | index | hr    | employees
 hr     | emp_name_ix             | index | hr    | employees
 hr     | jhist_department_ix     | index | hr    | job_history
 hr     | jhist_emp_id_st_date_pk | index | hr    | job_history
 hr     | jhist_employee_ix       | index | hr    | job_history
 hr     | jhist_job_ix            | index | hr    | job_history
 hr     | job_id_pk               | index | hr    | jobs
 hr     | loc_city_ix             | index | hr    | locations
 hr     | loc_country_ix          | index | hr    | locations
 hr     | loc_id_pk               | index | hr    | locations
 hr     | loc_state_province_ix   | index | hr    | locations
 hr     | reg_id_pk               | index | hr    | regions
(19 rows)

sampledb=> \dn
      List of schemas
  Name  |       Owner
--------+-------------------
 hr     | hr
 public | pg_database_owner
(2 rows)

sampledb=> \ds
              List of relations
 Schema |      Name       |   Type   | Owner
--------+-----------------+----------+-------
 hr     | departments_seq | sequence | hr
 hr     | employees_seq   | sequence | hr
 hr     | locations_seq   | sequence | hr
(3 rows)

sampledb=> \df
                                                                                    List of functions
 Schema |            Name            | Result data type |                                                   Argument data types                                                    | Type
--------+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+------
 hr     | add_job_history            |                  | IN p_emp_id numeric, IN p_start_date date, IN p_end_date date, IN p_job_id character varying, IN p_department_id numeric | proc
 hr     | secure_dml                 |                  |                                                                                                                          | proc
 hr     | secure_employees_trigger   | trigger          |                                                                                                                          | func
 hr     | update_job_history_trigger | trigger          |                                                                                                                          | func
(4 rows)

Finally, all the scripts are on GitHub. In the next post we will migrate the more complex Sales History sample schema.