Migrating the Oracle HR Sample Schema to PostgreSQL

Having read a PG book from cover to cover (Learning PostgreSQL, second edition), it is time for some hands-on practice.

The Oracle sample schemas include, among others:

  • Human Resources (HR)
  • Sales History (SH)
  • Customer Orders (CO)

HR is the smallest and simplest of these, so we start with it.

We use the Oracle 19c sample schemas, which can be downloaded with:

git clone --depth 1 --branch v19c https://github.com/oracle-samples/db-sample-schemas.git

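The install script for the HR schema is human_resources/hr_main.sql. It runs the following scripts in order (the .sql suffix is omitted below):

  1. hr_cre: creates the tables, sequences and constraints
  2. hr_popul: populates the tables with data
  3. hr_idx: creates the indexes
  4. hr_code: creates the procedural objects
  5. hr_comnt: adds comments to tables and columns
  6. hr_analz: gathers statistics

We migrate in the same order and keep the same file names as Oracle.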

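Creating the database: hr_db

Installing the Oracle sample schemas assumes the database (usually a PDB) already exists, so we likewise create the database and the cluster-level objects (mainly roles) first.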

CREATE ROLE dbadmin
    LOGIN
    CREATEDB
    CREATEROLE
    PASSWORD 'Welcome1';

CREATE ROLE hr LOGIN PASSWORD 'Welcome1';
CREATE ROLE sh LOGIN PASSWORD 'Welcome1';

GRANT hr to dbadmin;
GRANT sh to dbadmin;

\c postgres dbadmin

CREATE DATABASE sampledb
    OWNER dbadmin
    ENCODING 'UTF8'
    LC_COLLATE='en_US.UTF-8'
    LC_CTYPE='en_US.UTF-8'
    TEMPLATE template0;
	
\c sampledb dbadmin

CREATE SCHEMA AUTHORIZATION hr;
CREATE SCHEMA AUTHORIZATION sh;
GRANT CREATE ON SCHEMA hr TO hr;
GRANT CREATE ON SCHEMA sh TO sh;

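Here we assume the database is sampledb, the DBA is dbadmin, and the schema user is hr.

Creating the tables: hr_cre

The main change is replacing the data types NUMBER and VARCHAR2, using the following vi commands: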

:1,$ s/NUMBER/NUMERIC/g
:1,$ s/VARCHAR2/VARCHAR/g

The sequences and constraints also need minor syntax adjustments, which are not covered in detail here.
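For a non-interactive run, the same two substitutions can be sketched with sed. The table below is a made-up fragment in the style of the HR DDL, not the real hr_cre.sql, and the file names are hypothetical. The match is case-sensitive, so this assumes type names are upper case and column names are lower case; a column such as phone_number is then left alone.

```shell
# Scratch directory and a made-up Oracle-style DDL fragment (illustration only).
workdir=$(mktemp -d)
cat > "$workdir/hr_cre.sql" <<'EOF'
CREATE TABLE employees_demo
    ( employee_id    NUMBER(6)
    , last_name      VARCHAR2(25) NOT NULL
    , phone_number   VARCHAR2(20)
    , salary         NUMBER(8,2)
    ) ;
EOF

# Same substitutions as the vi commands, applied to every line of the file.
sed -e 's/VARCHAR2/VARCHAR/g' \
    -e 's/NUMBER/NUMERIC/g' \
    "$workdir/hr_cre.sql" > "$workdir/hr_cre_pg.sql"

cat "$workdir/hr_cre_pg.sql"
```

Writing to a new file rather than editing in place keeps the original Oracle script available for comparison.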

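Populating the tables: hr_popul

The DML statements are all standard SQL and needed no changes.

Only the Oracle (SQL*Plus) REM and Prompt statements were replaced, using the following vi commands: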

:1,$ s/^REM/--/g
:1,$ s/^Prompt/\\echo/g
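The same replacement can be sketched as a sed one-liner. The input below is a small made-up fragment in the style of hr_popul.sql, and the file names are hypothetical. REM lines become SQL comments and Prompt lines become psql \echo meta-commands.

```shell
# Scratch directory and a made-up fragment mixing SQL*Plus commands with DML.
workdir=$(mktemp -d)
cat > "$workdir/hr_popul.sql" <<'EOF'
REM ***** insert data into the REGIONS table

Prompt ******  Populating REGIONS table ....

INSERT INTO regions VALUES
        ( 1
        , 'Europe'
        );
EOF

# REM at line start -> SQL comment; Prompt at line start -> psql \echo.
# In the sed replacement, \\ produces a single literal backslash.
sed -e 's/^REM/--/' \
    -e 's/^Prompt/\\echo/' \
    "$workdir/hr_popul.sql" > "$workdir/hr_popul_pg.sql"

cat "$workdir/hr_popul_pg.sql"
```

Because both patterns are anchored at the start of the line, the INSERT statements pass through unchanged.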

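Creating the indexes: hr_idx

Nothing needed to change.

Creating the procedural objects: hr_code

This was the most time-consuming part; fortunately, the Oracle procedures, functions and triggers are fairly simple.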
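To give a sense of what this conversion involves, here is a hedged sketch for one procedure and one trigger. It is not the author's actual script. The procedure signature and the trigger function name follow the \df listing in the results section; the trigger name, the job_history column list and the use of CURRENT_DATE in place of Oracle's sysdate are assumptions based on the standard HR schema. The shell wrapper only writes the SQL to a scratch file.

```shell
# Write a PL/pgSQL sketch to a scratch file; nothing here touches a database.
workdir=$(mktemp -d)
cat > "$workdir/hr_code_sketch.sql" <<'SQL'
-- Oracle %TYPE parameters are written out as concrete types here.
CREATE OR REPLACE PROCEDURE add_job_history(
    p_emp_id        numeric,
    p_start_date    date,
    p_end_date      date,
    p_job_id        varchar,
    p_department_id numeric)
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO job_history (employee_id, start_date, end_date,
                             job_id, department_id)
    VALUES (p_emp_id, p_start_date, p_end_date, p_job_id, p_department_id);
END;
$$;

-- In PG the trigger body lives in a separate function returning trigger;
-- Oracle's :old.col becomes OLD.col.
CREATE OR REPLACE FUNCTION update_job_history_trigger() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    CALL add_job_history(OLD.employee_id, OLD.hire_date, CURRENT_DATE,
                         OLD.job_id, OLD.department_id);
    RETURN NEW;
END;
$$;

-- Trigger name is an assumption (taken from the Oracle original).
CREATE TRIGGER update_job_history
    AFTER UPDATE OF job_id, department_id ON employees
    FOR EACH ROW EXECUTE FUNCTION update_job_history_trigger();
SQL

# To try it against a real server (commented out on purpose):
# psql -U hr -d sampledb -f "$workdir/hr_code_sketch.sql"
```

The main structural difference is that an Oracle trigger carries its own body, while a PG trigger only points at a trigger function, which is why the \df listing shows trigger functions next to the procedures.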

Adding comments: hr_comnt

This part needed only a minor change, using the following vi command:

:1,$ s/'$/';/g

The reason is that Oracle's syntax is more lenient: some statements have no terminating ;, which PG does not accept. For example:

COMMENT ON TABLE regions
IS 'Regions table that contains region numbers and names. Contains 4 rows; references with the Countries table.'

COMMENT ON COLUMN regions.region_id
IS 'Primary key of regions table.'

COMMENT ON COLUMN regions.region_name
IS 'Names of regions. Locations are in the countries of these regions.'
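The semicolon fix can be sketched with sed on the three statements above. The file names are hypothetical. The pattern only matches lines that end exactly in a single quote, so it assumes no trailing whitespace or carriage returns and no multi-line string literal whose intermediate line happens to end in a quote.

```shell
# Scratch copy of the three unterminated COMMENT statements shown above.
workdir=$(mktemp -d)
cat > "$workdir/hr_comnt.sql" <<'EOF'
COMMENT ON TABLE regions
IS 'Regions table that contains region numbers and names. Contains 4 rows; references with the Countries table.'

COMMENT ON COLUMN regions.region_id
IS 'Primary key of regions table.'

COMMENT ON COLUMN regions.region_name
IS 'Names of regions. Locations are in the countries of these regions.'
EOF

# Append ; to every line that ends in a single quote.
# Lines already ending in '; do not match, so re-running this is harmless.
sed "s/'\$/';/" "$workdir/hr_comnt.sql" > "$workdir/hr_comnt_pg.sql"

cat "$workdir/hr_comnt_pg.sql"
```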

Gathering statistics

PG has no statement that gathers statistics for an entire schema, so the tables have to be analyzed one by one:

DO $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN
        SELECT tablename
        FROM pg_tables
        WHERE schemaname = 'hr'
    LOOP
        EXECUTE 'ANALYZE hr.' || quote_ident(r.tablename);
    END LOOP;
END$$;

Some notes

The Oracle scripts all commit explicitly, whereas psql runs in autocommit mode by default.

postgres=# \echo :AUTOCOMMIT
on

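Because the tables have complex referential integrity relationships, these constraints are created after the data has been inserted, to avoid errors during the load.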
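As an illustration of this ordering, the sketch below writes two foreign keys to a separate file that would be run only after the data load. In the standard HR schema, departments and employees reference each other, which makes the load order awkward if the keys exist up front. The constraint and column names are assumptions taken from the Oracle HR schema, and the file name is hypothetical; the author's scripts may organize this differently.

```shell
# Write the deferred foreign keys to a scratch file; no database is touched.
workdir=$(mktemp -d)
cat > "$workdir/hr_fk_sketch.sql" <<'SQL'
-- Run only after every table has been populated.
-- Names below are assumed from the Oracle HR schema.
ALTER TABLE departments
    ADD CONSTRAINT dept_mgr_fk
    FOREIGN KEY (manager_id) REFERENCES employees (employee_id);

ALTER TABLE employees
    ADD CONSTRAINT emp_dept_fk
    FOREIGN KEY (department_id) REFERENCES departments (department_id);
SQL

# Intended order against a real server (commented out on purpose):
# psql -U hr -d sampledb -f hr_cre.sql      # tables without these keys
# psql -U hr -d sampledb -f hr_popul.sql    # load the data
# psql -U hr -d sampledb -f "$workdir/hr_fk_sketch.sql"
```

Adding the keys afterwards still validates the existing rows, so inconsistent data would be reported at this step rather than silently accepted.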

Results

$ psql -U hr sampledb
Password for user hr:
psql (16.9)
Type "help" for help.

sampledb=> \d
              List of relations
 Schema |       Name       |   Type   | Owner
--------+------------------+----------+-------
 hr     | countries        | table    | hr
 hr     | departments      | table    | hr
 hr     | departments_seq  | sequence | hr
 hr     | emp_details_view | view     | hr
 hr     | employees        | table    | hr
 hr     | employees_seq    | sequence | hr
 hr     | job_history      | table    | hr
 hr     | jobs             | table    | hr
 hr     | locations        | table    | hr
 hr     | locations_seq    | sequence | hr
 hr     | regions          | table    | hr
(11 rows)

sampledb=> select count(*) from employees;
 count
-------
   107
(1 row)


sampledb=> \dv
            List of relations
 Schema |       Name       | Type | Owner
--------+------------------+------+-------
 hr     | emp_details_view | view | hr
(1 row)

sampledb=> \di
                       List of relations
 Schema |          Name           | Type  | Owner |    Table
--------+-------------------------+-------+-------+-------------
 hr     | country_c_id_pk         | index | hr    | countries
 hr     | dept_id_pk              | index | hr    | departments
 hr     | dept_location_ix        | index | hr    | departments
 hr     | emp_department_ix       | index | hr    | employees
 hr     | emp_email_uk            | index | hr    | employees
 hr     | emp_emp_id_pk           | index | hr    | employees
 hr     | emp_job_ix              | index | hr    | employees
 hr     | emp_manager_ix          | index | hr    | employees
 hr     | emp_name_ix             | index | hr    | employees
 hr     | jhist_department_ix     | index | hr    | job_history
 hr     | jhist_emp_id_st_date_pk | index | hr    | job_history
 hr     | jhist_employee_ix       | index | hr    | job_history
 hr     | jhist_job_ix            | index | hr    | job_history
 hr     | job_id_pk               | index | hr    | jobs
 hr     | loc_city_ix             | index | hr    | locations
 hr     | loc_country_ix          | index | hr    | locations
 hr     | loc_id_pk               | index | hr    | locations
 hr     | loc_state_province_ix   | index | hr    | locations
 hr     | reg_id_pk               | index | hr    | regions
(19 rows)

sampledb=> \dn
      List of schemas
  Name  |       Owner
--------+-------------------
 hr     | hr
 public | pg_database_owner
(2 rows)

sampledb=> \ds
              List of relations
 Schema |      Name       |   Type   | Owner
--------+-----------------+----------+-------
 hr     | departments_seq | sequence | hr
 hr     | employees_seq   | sequence | hr
 hr     | locations_seq   | sequence | hr
(3 rows)

sampledb=> \df
                                                                                    List of functions
 Schema |            Name            | Result data type |                                                   Argument data types                                                    | Type
--------+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+------
 hr     | add_job_history            |                  | IN p_emp_id numeric, IN p_start_date date, IN p_end_date date, IN p_job_id character varying, IN p_department_id numeric | proc
 hr     | secure_dml                 |                  |                                                                                                                          | proc
 hr     | secure_employees_trigger   | trigger          |                                                                                                                          | func
 hr     | update_job_history_trigger | trigger          |                                                                                                                          | func
(4 rows)

Finally, all the scripts are on GitHub. In the next post we will migrate the more complex Sales History sample schema.