Iβm not sure I can explain it better than the links I gave above, but the short version is that using the MySQL or Postgres table types in ClickHouse means that ClickHouse is connecting out to those servers, not the other way around. So as long as your CH container can reach the MySQL or Postgres server you should be ok (and if not, we probably canβt help you with networking here).
The steps to do a full test are fairly simple (assuming MySQL for below):
Set up a database / table to write to and a user with permissions to write (Iβm using root below for simplicity):
mysql> create database clickhouse_test_db;
Query OK, 1 row affected (0.01 sec)
mysql> use clickhouse_test_db;
Database changed
mysql> create table test_verb (id varchar(64) not null, verb varchar(255) not null);
Query OK, 0 rows affected (0.02 sec)
Then set up ClickHouse to connect to the given table:
ba4310497a7e :) CREATE TABLE IF NOT EXISTS default.mysql_test
(
id varchar(64) not null,
verb varchar(255) not null
) ENGINE = MySQL('mysql-host:3306', 'clickhouse_test_db', 'test_verb', 'root', '<root password>');
CREATE TABLE IF NOT EXISTS default.mysql_test
(
`id` varchar(64) NOT NULL,
`verb` varchar(255) NOT NULL
)
ENGINE = MySQL('mysql-host:3306', 'clickhouse_test_db', 'test_verb', 'root', '<root password>')
Query id: 7c8dc4b5-97f6-4c42-a8de-ca5f8fbe4459
Ok.
0 rows in set. Elapsed: 0.003 sec.
Check that the MySQL table is empty (queried from ClickHouse):
ba4310497a7e :) select * from default.mysql_test;
SELECT *
FROM default.mysql_test
Query id: 583509dc-df23-4e1b-910e-054d5b064441
Ok.
0 rows in set.
Elapsed: 0.012 sec.
Insert some data into MySQL from ClickHouse (this presumes you have data in ClickHouse xapi.xapi_events_all_parsed if itβs a lot of data you may just want to throw a LIMIT on the query):
ba4310497a7e :) insert into default.mysql_test (id, verb) select event_id, verb_id from xapi.xapi_events_all_parsed;
INSERT INTO default.mysql_test (id, verb) SELECT
event_id,
verb_id
FROM xapi.xapi_events_all_parsed
Query id: 626bd10e-330d-4692-8628-e9974f961baf
Ok.
0 rows in set. Elapsed: 0.134 sec. Processed 11.00 thousand rows, 195.99 KB (82.27 thousand rows/s., 1.47 MB/s.)
Peak memory usage: 748.37 KiB.
Select from MySQL in ClickHouse again:
ba4310497a7e :) select * from default.mysql_test limit 5;
SELECT *
FROM default.mysql_test
LIMIT 5
Query id: b4b5ef90-8e86-4d90-a32b-6cdfa709c297
ββidββββββββββββββββββββββββββββββββββββ¬βverbββββββββββββββββββββββββββββββββββββββ
1. β 529c2e05-013d-49a3-b79d-7250f65422f3 β http://adlnet.gov/expapi/verbs/attempted β
2. β 1eb33300-203a-40ed-8084-f18ff1b10e2e β http://adlnet.gov/expapi/verbs/attempted β
3. β d2fb3795-3228-43be-9f94-d27e1b6bd339 β http://adlnet.gov/expapi/verbs/attempted β
4. β a674a8fd-8baf-4052-b578-0b13024f4009 β http://adlnet.gov/expapi/verbs/attempted β
5. β 55a484af-9f77-416a-87ce-655f07ba62f7 β http://adlnet.gov/expapi/verbs/attempted β
ββββββββββββββββββββββββββββββββββββββββ΄βββββββββββββββββββββββββββββββββββββββββββ
5 rows in set. Elapsed: 0.007 sec. Processed 11.00 thousand rows, 1.04 MB (1.67 million rows/s., 157.55 MB/s.)
Peak memory usage: 11.95 KiB.
And on the MySQL side:
mysql> select * from test_verb limit 5;
+--------------------------------------+------------------------------------------+
| id | verb |
+--------------------------------------+------------------------------------------+
| 529c2e05-013d-49a3-b79d-7250f65422f3 | http://adlnet.gov/expapi/verbs/attempted |
| 1eb33300-203a-40ed-8084-f18ff1b10e2e | http://adlnet.gov/expapi/verbs/attempted |
| d2fb3795-3228-43be-9f94-d27e1b6bd339 | http://adlnet.gov/expapi/verbs/attempted |
| a674a8fd-8baf-4052-b578-0b13024f4009 | http://adlnet.gov/expapi/verbs/attempted |
| 55a484af-9f77-416a-87ce-655f07ba62f7 | http://adlnet.gov/expapi/verbs/attempted |
+--------------------------------------+------------------------------------------+
5 rows in set (0.00 sec)
Thatβs the basics of it. Any query you can write in ClickHouse can then materialize its results to a table in one of the other databases. You just need a mechanism to run those queries on a timer to keep the data fresh.