System versioning is a MariaDB feature that makes data-warehouse modeling and SQL development much easier and more readable.
services:
  mariadb:
    image: mariadb:10.9.2
CITYDIM_NOHIST will be our slowly changing dimension of type 1 (i.e. new value → overwrite)
DEGREEDIM_HIST will be our slowly changing dimension of type 2 (i.e. new value → new record with a different validity time span - which, with this technology, is managed behind the scenes)
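System versioning automates bookkeeping that otherwise must be done by hand. As a rough illustration (not MariaDB code: a Python/SQLite sketch with hypothetical table and column names), this is the manual close-old-row/open-new-row dance that a type-2 dimension requires without it:

```python
import sqlite3
from datetime import datetime, timezone

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Type 1 (no history): an update simply overwrites the value.
cur.execute("CREATE TABLE citydim_nohist (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
cur.execute("INSERT INTO citydim_nohist (name) VALUES ('Rome')")
cur.execute("UPDATE citydim_nohist SET name = 'Roma' WHERE id = 1")

# Type 2 (history kept by hand): close the current row, insert a new one.
# With MariaDB system versioning, the period columns are maintained automatically.
cur.execute("""CREATE TABLE degreedim_hist (
    id INTEGER, name TEXT,
    row_start TEXT, row_end TEXT)""")

def scd2_upsert(cur, dim_id, new_name):
    now = datetime.now(timezone.utc).isoformat()
    # close the currently valid version, if any
    cur.execute("UPDATE degreedim_hist SET row_end = ? "
                "WHERE id = ? AND row_end = '9999-12-31'", (now, dim_id))
    # open the new version
    cur.execute("INSERT INTO degreedim_hist VALUES (?, ?, ?, '9999-12-31')",
                (dim_id, new_name, now))

scd2_upsert(cur, 1, "BSc")
scd2_upsert(cur, 1, "MSc")   # same id, new value -> second row, first row closed

rows = cur.execute("SELECT name, row_end FROM degreedim_hist "
                   "ORDER BY row_start").fetchall()
print(rows)  # two versions: ('BSc', <closed>), ('MSc', '9999-12-31')
```

With WITH SYSTEM VERSIONING, MariaDB performs the equivalent of this upsert transparently on every UPDATE or DELETE.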
CREATE OR REPLACE TABLE CITYDIM_NOHIST (
  id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50) UNIQUE KEY
);
CREATE OR REPLACE TABLE DEGREEDIM_HIST (
  id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50) UNIQUE KEY
) WITH SYSTEM VERSIONING;
CREATE OR REPLACE TABLE FACT (
  id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50),
  city_id MEDIUMINT,
  degree_id MEDIUMINT,
  email VARCHAR(50)
) WITH SYSTEM VERSIONING;
ALTER TABLE FACT
ADD CONSTRAINT fk_city FOREIGN KEY (city_id) REFERENCES CITYDIM_NOHIST(id),
ADD CONSTRAINT fk_degree FOREIGN KEY (degree_id) REFERENCES DEGREEDIM_HIST(id);
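To see the behind-the-scenes history, one can update the versioned dimension and query all row versions (a sketch, assuming the tables above exist on a MariaDB server; ROW_START and ROW_END are the default names of the generated period columns):

```sql
INSERT INTO DEGREEDIM_HIST (name) VALUES ('BSc');
UPDATE DEGREEDIM_HIST SET name = 'MSc' WHERE id = 1;

-- current state: one row, 'MSc'
SELECT * FROM DEGREEDIM_HIST;

-- full history: both versions, with their validity time spans
SELECT id, name, ROW_START, ROW_END
FROM DEGREEDIM_HIST FOR SYSTEM_TIME ALL;
```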
example of a parametric past-date query:
-- query1
select
F.id as pid,
F.name as pname,
F.email as email,
C.name as cname,
D.name as dname
from
CITYDIM_NOHIST as C,
DEGREEDIM_HIST FOR SYSTEM_TIME AS OF TIMESTAMP ?.ts as D,
FACT FOR SYSTEM_TIME AS OF TIMESTAMP ?.ts as F
where F.degree_id = D.id and F.city_id = C.id
and F.name = ?.nm
example of a past-date query from an application (Groovy):
String past = new Date().format("yyyy-MM-dd HH:mm:ss", TimeZone.getTimeZone('UTC')) // capture the current UTC time; it becomes "the past" as new versions are written
// eg '2018-05-03 07:22:33'
Map row1 = sql.firstRow(query1, [ts:past, nm:'Barbara'])
source code at: https://github.com/a-moscatelli/DEV/blob/main/mariadb_data_versioning/test_hist_query_on_mariadb.groovy
The same queries are supported by the Delta Lake feature of Databricks on a Spark architecture.
df.write.mode("overwrite").format("delta").saveAsTable("mydb.mytable_delta")
# audit query
display(spark.sql("describe history mytable_delta"))
# queries
display(spark.sql("select * from mytable_delta version as of 3"))
# or
display(spark.sql("select * from mytable_delta timestamp as of '2022-03-04T05:06:07.000+08:00'"))
references:
postgres time travel (discontinued)
https://www.postgresql.org/docs/6.3/c0503.htm#:~:text=As of Postgres v6.,a short period of time.
the feature was meant to support point-in-time restore.
also:
https://neon.tech/blog/time-travel-with-postgres
... to support db branching.