Get started with Aiven for ClickHouse®
Start using Aiven for ClickHouse® by creating and configuring a service, connecting to it, and loading sample data.
Prerequisites
Console:
- Access to the Aiven Console
- Docker installed
Terraform:
- Terraform installed
- A personal token
- Docker installed
Kubernetes:
- Aiven Operator for Kubernetes® installed
- A personal token
- A Kubernetes secret storing your token
- Docker installed
ClickHouse client:
- ClickHouse CLI client installed
- Docker installed
Create an Aiven for ClickHouse® service
Console:
- From your project, on the Services page, click Create service.
- From the Select service page, click ClickHouse®.
- Select the cloud provider and region to host your service on.
  Note: The pricing for the same service can vary between providers and regions. The service summary shows the pricing for your selected options.
- Select a service plan.
  Note: The plan determines the number of servers and the memory, CPU, and disk resources allocated to your service. See Plans & Pricing.
- Optional: Add disk storage.
- Enter a name for your service.
  Important: You cannot change the name after you create the service. You can fork the service with a new name instead.
- Optional: Add tags.
- Click Create service.
The Overview page of the service opens. It shows the connection parameters for your service, its current status, and the configuration options.
The status of the service is Rebuilding during its creation. When the status becomes Running, you can start using the service. This typically takes a couple of minutes and can vary between cloud providers and regions.
Terraform:
In this example, an Aiven for ClickHouse service is used to store IoT sensor data. You create the service and two service users, and assign each user a role:
- Give the ETL user permission to insert data.
- Give the analyst user access to view data in the measurements database.
The example files for the following steps are available in the Aiven Terraform Provider repository on GitHub.
- Create a file named provider.tf and add the content of the provider.tf example file.
- Create a file named service.tf and add the content of the service.tf example file.
- Create a file named service_users.tf and add the content of the service_users.tf example file.
- Create a file named variables.tf and add the content of the variables.tf example file.
- Create the terraform.tfvars file and add the values for your token and project name.
- To output connection details, create a file named output.tf and add the content of the output.tf example file.
To apply your Terraform configuration:
- Initialize Terraform by running:
      terraform init
  The output is similar to the following:
      Initializing the backend...
      Initializing provider plugins...
      - Finding aiven/aiven versions matching ">= 4.0.0, < 5.0.0"...
      - Installing aiven/aiven v4.9.2...
      - Installed aiven/aiven v4.9.2
      ...
      Terraform has been successfully initialized!
      ...
- To create an execution plan and preview the changes, run:
      terraform plan
- To deploy your changes, run:
      terraform apply --auto-approve
Kubernetes:
Create an Aiven for ClickHouse service using the Aiven Operator for Kubernetes.
- Create a file named example.yaml with the following content:
      apiVersion: aiven.io/v1alpha1
      kind: Clickhouse
      metadata:
        name: my-clickhouse
      spec:
        authSecretRef:
          name: aiven-token
          key: token
        connInfoSecretTarget:
          name: my-clickhouse-connection
        userConfig:
          service_log: false
        project: my-aiven-project
        cloudName: google-europe-west1
        plan: startup-16
        maintenanceWindowDow: friday
        maintenanceWindowTime: 23:00:00
- Create the service by applying the configuration:
      kubectl apply -f example.yaml
- Review the resource you created with the following command:
      kubectl get clickhouses my-clickhouse
  The output is similar to the following:
      Name           Project           Region               Plan        State
      my-clickhouse  my-aiven-project  google-europe-west1  startup-16  RUNNING
The resource can stay in the REBUILDING state for a couple of minutes. Once the state changes to RUNNING, you are ready to access it.
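Because the resource can sit in REBUILDING for a while, it can be handy to poll for RUNNING instead of rerunning the command by hand. The following is a minimal sketch, not part of the Aiven tooling: the wait_for_running function and the ATTEMPTS and INTERVAL variables are hypothetical names, and the example call assumes the operator exposes the state at .status.state.

```shell
# wait_for_running: run the given command repeatedly until it prints RUNNING.
# Hypothetical helper; ATTEMPTS and INTERVAL are illustrative knobs.
wait_for_running() {
  attempts="${ATTEMPTS:-30}"   # maximum number of polls
  interval="${INTERVAL:-10}"   # seconds to sleep between polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state="$("$@" 2>/dev/null)"
    if [ "$state" = "RUNNING" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "service did not reach RUNNING after $attempts attempts" >&2
  return 1
}

# Example call (assumes the state is available at .status.state):
# wait_for_running kubectl get clickhouses my-clickhouse -o jsonpath='{.status.state}'
```

The helper takes the state-fetching command as its arguments, so the same loop works with any command that prints the current state.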
Configure the service
You can change your service settings by updating the service configuration.
Console:
- Select the new service from the list of services on the Services page.
- On the Overview page, select Service settings from the sidebar.
- In the Advanced configuration section, make changes to the service configuration.
See the available configuration options in Advanced parameters for Aiven for ClickHouse®.
Terraform:
See the aiven_clickhouse resource documentation for the full schema.
Kubernetes:
- Update the example.yaml file:
  - Set service_log: true and add terminationProtection: true.
  - Update maintenanceWindowDow: sunday and maintenanceWindowTime: 22:00:00.
      apiVersion: aiven.io/v1alpha1
      kind: Clickhouse
      metadata:
        name: my-clickhouse
      spec:
        authSecretRef:
          name: aiven-token
          key: token
        connInfoSecretTarget:
          name: my-clickhouse-connection
        userConfig:
          service_log: true
        project: my-aiven-project
        cloudName: google-europe-west1
        plan: startup-16
        maintenanceWindowDow: sunday
        maintenanceWindowTime: 22:00:00
        terminationProtection: true
- Update the service by applying the configuration:
      kubectl apply -f example.yaml
- Review the resource you updated with the following command:
      kubectl get clickhouses my-clickhouse
  The resource can stay in the REBUILDING state for a couple of minutes. Once the state changes to RUNNING, you are ready to access it.
See the available configuration options in Aiven Operator for Kubernetes®: ClickHouse.
Connect to the service
Console:
- Log in to the Aiven Console, and go to your organization > project > Aiven for ClickHouse service.
- On the Overview page of your service, click Quick connect.
- In the Connect window, select a tool or language to connect to your service, follow the connection instructions, and click Done.
      docker run -it \
      --rm clickhouse/clickhouse-server clickhouse-client \
      --user avnadmin \
      --password admin_password \
      --host clickhouse-service-name-project-name.e.aivencloud.com \
      --port 12691 \
      --secure
Terraform:
Access your new service with the ClickHouse client using the Terraform outputs.
- To store the outputs in environment variables, run:
      CLICKHOUSE_HOST="$(terraform output -raw clickhouse_service_host)"
      CLICKHOUSE_PORT="$(terraform output -raw clickhouse_service_port)"
      CLICKHOUSE_USER="$(terraform output -raw clickhouse_service_username)"
      CLICKHOUSE_PASSWORD="$(terraform output -raw clickhouse_service_password)"
- To use the environment variables to connect to the service, run:
      docker run -it \
      --rm clickhouse/clickhouse-client \
      --user="$CLICKHOUSE_USER" \
      --password="$CLICKHOUSE_PASSWORD" \
      --host="$CLICKHOUSE_HOST" \
      --port="$CLICKHOUSE_PORT" \
      --secure
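If one of the terraform output commands fails, the matching variable ends up empty and the client reports a less obvious connection error. As a hedged sketch, a small guard can check the variables before connecting. The require_vars function name is hypothetical and not part of Terraform or ClickHouse.

```shell
# require_vars: succeed only if every named variable is set and non-empty.
# Hypothetical helper; pass variable names, not values.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"          # indirect lookup that works in POSIX sh
    if [ -z "$value" ]; then
      echo "missing or empty: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: run the check first, and connect only if it succeeds.
# require_vars CLICKHOUSE_HOST CLICKHOUSE_PORT CLICKHOUSE_USER CLICKHOUSE_PASSWORD
```

The function reports every missing variable rather than stopping at the first one, which makes a failed Terraform output easier to spot.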
Discover more tools for connecting to Aiven for ClickHouse in Connect to Aiven for ClickHouse®.
Load a dataset
- Download a dataset from Example Datasets using cURL:
      curl https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz | unxz --threads=`nproc` > hits_v1.tsv
      curl https://datasets.clickhouse.com/visits/tsv/visits_v1.tsv.xz | unxz --threads=`nproc` > visits_v1.tsv
  Note: The nproc Linux command, which prints the number of processing units, is not available on macOS. To use the command, add an alias for nproc into your ~/.zshrc file: alias nproc="sysctl -n hw.logicalcpu".
  Once done, you should have two files: hits_v1.tsv and visits_v1.tsv.
- Create tables hits_v1 and visits_v1 in the default database, which has been created automatically upon the creation of your Aiven for ClickHouse service.
  CREATE TABLE default.hits_v1 sample:
CREATE TABLE default.hits_v1 (
WatchID UInt64,
JavaEnable UInt8,
Title String,
GoodEvent Int16,
EventTime DateTime,
EventDate Date,
CounterID UInt32,
ClientIP UInt32,
ClientIP6 FixedString(16),
RegionID UInt32,
UserID UInt64,
CounterClass Int8,
OS UInt8,
UserAgent UInt8,
URL String,
Referer String,
URLDomain String,
RefererDomain String,
Refresh UInt8,
IsRobot UInt8,
RefererCategories Array(UInt16),
URLCategories Array(UInt16),
URLRegions Array(UInt32),
RefererRegions Array(UInt32),
ResolutionWidth UInt16,
ResolutionHeight UInt16,
ResolutionDepth UInt8,
FlashMajor UInt8,
FlashMinor UInt8,
FlashMinor2 String,
NetMajor UInt8,
NetMinor UInt8,
UserAgentMajor UInt16,
UserAgentMinor FixedString(2),
CookieEnable UInt8,
JavascriptEnable UInt8,
IsMobile UInt8,
MobilePhone UInt8,
MobilePhoneModel String,
Params String,
IPNetworkID UInt32,
TraficSourceID Int8,
SearchEngineID UInt16,
SearchPhrase String,
AdvEngineID UInt8,
IsArtifical UInt8,
WindowClientWidth UInt16,
WindowClientHeight UInt16,
ClientTimeZone Int16,
ClientEventTime DateTime,
SilverlightVersion1 UInt8,
SilverlightVersion2 UInt8,
SilverlightVersion3 UInt32,
SilverlightVersion4 UInt16,
PageCharset String,
CodeVersion UInt32,
IsLink UInt8,
IsDownload UInt8,
IsNotBounce UInt8,
FUniqID UInt64,
HID UInt32,
IsOldCounter UInt8,
IsEvent UInt8,
IsParameter UInt8,
DontCountHits UInt8,
WithHash UInt8,
HitColor FixedString(1),
UTCEventTime DateTime,
Age UInt8,
Sex UInt8,
Income UInt8,
Interests UInt16,
Robotness UInt8,
GeneralInterests Array(UInt16),
RemoteIP UInt32,
RemoteIP6 FixedString(16),
WindowName Int32,
OpenerName Int32,
HistoryLength Int16,
BrowserLanguage FixedString(2),
BrowserCountry FixedString(2),
SocialNetwork String,
SocialAction String,
HTTPError UInt16,
SendTiming Int32,
DNSTiming Int32,
ConnectTiming Int32,
ResponseStartTiming Int32,
ResponseEndTiming Int32,
FetchTiming Int32,
RedirectTiming Int32,
DOMInteractiveTiming Int32,
DOMContentLoadedTiming Int32,
DOMCompleteTiming Int32,
LoadEventStartTiming Int32,
LoadEventEndTiming Int32,
NSToDOMContentLoadedTiming Int32,
FirstPaintTiming Int32,
RedirectCount Int8,
SocialSourceNetworkID UInt8,
SocialSourcePage String,
ParamPrice Int64,
ParamOrderID String,
ParamCurrency FixedString(3),
ParamCurrencyID UInt16,
GoalsReached Array(UInt32),
OpenstatServiceName String,
OpenstatCampaignID String,
OpenstatAdID String,
OpenstatSourceID String,
UTMSource String,
UTMMedium String,
UTMCampaign String,
UTMContent String,
UTMTerm String,
FromTag String,
HasGCLID UInt8,
RefererHash UInt64,
URLHash UInt64,
CLID UInt32,
YCLID UInt64,
ShareService String,
ShareURL String,
ShareTitle String,
ParsedParams Nested(
Key1 String,
Key2 String,
Key3 String,
Key4 String,
Key5 String,
ValueDouble Float64
),
IslandID FixedString(16),
RequestNum UInt32,
RequestTry UInt8
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(EventDate)
ORDER BY
(CounterID, EventDate, intHash32(UserID));
  CREATE TABLE default.visits_v1 sample:
CREATE TABLE default.visits_v1 (
CounterID UInt32,
StartDate Date,
Sign Int8,
IsNew UInt8,
VisitID UInt64,
UserID UInt64,
StartTime DateTime,
Duration UInt32,
UTCStartTime DateTime,
PageViews Int32,
Hits Int32,
IsBounce UInt8,
Referer String,
StartURL String,
RefererDomain String,
StartURLDomain String,
EndURL String,
LinkURL String,
IsDownload UInt8,
TraficSourceID Int8,
SearchEngineID UInt16,
SearchPhrase String,
AdvEngineID UInt8,
PlaceID Int32,
RefererCategories Array(UInt16),
URLCategories Array(UInt16),
URLRegions Array(UInt32),
RefererRegions Array(UInt32),
IsYandex UInt8,
GoalReachesDepth Int32,
GoalReachesURL Int32,
GoalReachesAny Int32,
SocialSourceNetworkID UInt8,
SocialSourcePage String,
MobilePhoneModel String,
ClientEventTime DateTime,
RegionID UInt32,
ClientIP UInt32,
ClientIP6 FixedString(16),
RemoteIP UInt32,
RemoteIP6 FixedString(16),
IPNetworkID UInt32,
SilverlightVersion3 UInt32,
CodeVersion UInt32,
ResolutionWidth UInt16,
ResolutionHeight UInt16,
UserAgentMajor UInt16,
UserAgentMinor UInt16,
WindowClientWidth UInt16,
WindowClientHeight UInt16,
SilverlightVersion2 UInt8,
SilverlightVersion4 UInt16,
FlashVersion3 UInt16,
FlashVersion4 UInt16,
ClientTimeZone Int16,
OS UInt8,
UserAgent UInt8,
ResolutionDepth UInt8,
FlashMajor UInt8,
FlashMinor UInt8,
NetMajor UInt8,
NetMinor UInt8,
MobilePhone UInt8,
SilverlightVersion1 UInt8,
Age UInt8,
Sex UInt8,
Income UInt8,
JavaEnable UInt8,
CookieEnable UInt8,
JavascriptEnable UInt8,
IsMobile UInt8,
BrowserLanguage UInt16,
BrowserCountry UInt16,
Interests UInt16,
Robotness UInt8,
GeneralInterests Array(UInt16),
Params Array(String),
Goals Nested(
ID UInt32,
Serial UInt32,
EventTime DateTime,
Price Int64,
OrderID String,
CurrencyID UInt32
),
WatchIDs Array(UInt64),
ParamSumPrice Int64,
ParamCurrency FixedString(3),
ParamCurrencyID UInt16,
ClickLogID UInt64,
ClickEventID Int32,
ClickGoodEvent Int32,
ClickEventTime DateTime,
ClickPriorityID Int32,
ClickPhraseID Int32,
ClickPageID Int32,
ClickPlaceID Int32,
ClickTypeID Int32,
ClickResourceID Int32,
ClickCost UInt32,
ClickClientIP UInt32,
ClickDomainID UInt32,
ClickURL String,
ClickAttempt UInt8,
ClickOrderID UInt32,
ClickBannerID UInt32,
ClickMarketCategoryID UInt32,
ClickMarketPP UInt32,
ClickMarketCategoryName String,
ClickMarketPPName String,
ClickAWAPSCampaignName String,
ClickPageName String,
ClickTargetType UInt16,
ClickTargetPhraseID UInt64,
ClickContextType UInt8,
ClickSelectType Int8,
ClickOptions String,
ClickGroupBannerID Int32,
OpenstatServiceName String,
OpenstatCampaignID String,
OpenstatAdID String,
OpenstatSourceID String,
UTMSource String,
UTMMedium String,
UTMCampaign String,
UTMContent String,
UTMTerm String,
FromTag String,
HasGCLID UInt8,
FirstVisit DateTime,
PredLastVisit Date,
LastVisit Date,
TotalVisits UInt32,
TraficSource Nested(
ID Int8,
SearchEngineID UInt16,
AdvEngineID UInt8,
PlaceID UInt16,
SocialSourceNetworkID UInt8,
Domain String,
SearchPhrase String,
SocialSourcePage String
),
Attendance FixedString(16),
CLID UInt32,
YCLID UInt64,
NormalizedRefererHash UInt64,
SearchPhraseHash UInt64,
RefererDomainHash UInt64,
NormalizedStartURLHash UInt64,
StartURLDomainHash UInt64,
NormalizedEndURLHash UInt64,
TopLevelDomain UInt64,
URLScheme UInt64,
OpenstatServiceNameHash UInt64,
OpenstatCampaignIDHash UInt64,
OpenstatAdIDHash UInt64,
OpenstatSourceIDHash UInt64,
UTMSourceHash UInt64,
UTMMediumHash UInt64,
UTMCampaignHash UInt64,
UTMContentHash UInt64,
UTMTermHash UInt64,
FromHash UInt64,
WebVisorEnabled UInt8,
WebVisorActivity UInt32,
ParsedParams Nested(
Key1 String,
Key2 String,
Key3 String,
Key4 String,
Key5 String,
ValueDouble Float64
),
Market Nested(
Type UInt8,
GoalID UInt32,
OrderID String,
OrderPrice Int64,
PP UInt32,
DirectPlaceID UInt32,
DirectOrderID UInt32,
DirectBannerID UInt32,
GoodID String,
GoodName String,
GoodQuantity Int32,
GoodPrice Int64
),
IslandID FixedString(16)
)
ENGINE = CollapsingMergeTree(Sign)
PARTITION BY toYYYYMM(StartDate)
ORDER BY
(CounterID, StartDate, intHash32(UserID), VisitID);
- Load data into tables hits_v1 and visits_v1.
  - Go to the folder where you stored the downloaded files hits_v1.tsv and visits_v1.tsv.
  - Run the following commands:
      cat hits_v1.tsv | docker run \
      --interactive \
      --rm clickhouse/clickhouse-server clickhouse-client \
      --user USERNAME \
      --password PASSWORD \
      --host HOST \
      --port PORT \
      --secure \
      --max_insert_block_size=100000 \
      --query="INSERT INTO default.hits_v1 FORMAT TSV"

      cat visits_v1.tsv | docker run \
      --interactive \
      --rm clickhouse/clickhouse-server clickhouse-client \
      --user USERNAME \
      --password PASSWORD \
      --host HOST \
      --port PORT \
      --secure \
      --max_insert_block_size=100000 \
      --query="INSERT INTO default.visits_v1 FORMAT TSV"
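The download commands in the first step call nproc, which is missing on macOS. As an alternative to adding an alias, a small helper can pick whichever CPU-count command is available. This is only a sketch: the cpu_count function name is hypothetical, and the getconf fallback is an assumption about what the system provides.

```shell
# cpu_count: print the number of logical CPUs using whichever tool exists.
# Hypothetical helper, not part of any Aiven or ClickHouse tooling.
cpu_count() {
  if command -v nproc >/dev/null 2>&1; then
    nproc                                             # Linux
  elif sysctl -n hw.logicalcpu >/dev/null 2>&1; then
    sysctl -n hw.logicalcpu                           # macOS
  else
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1   # generic fallback
  fi
}

# Example: use it in place of `nproc` in the download command.
# curl https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz | unxz --threads="$(cpu_count)" > hits_v1.tsv
```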
Query data
Once the data is loaded, you can run queries against the sample data you imported.
- Query the number of items in the hits_v1 table:
      SELECT COUNT(*) FROM default.hits_v1
- Find the longest lasting sessions:
      SELECT StartURL AS URL,
      MAX(Duration) AS MaxDuration
      FROM default.visits_v1
      GROUP BY URL
      ORDER BY MaxDuration DESC
      LIMIT 10